9 Commits

Author SHA1 Message Date
Oleksandr Bezdieniezhnykh 1ebab29a4f [AZ-332] C1 OKVIS2 Strategy: facade + binding skeleton
Python facade (`Okvis2Strategy`) is production-quality and satisfies
AZ-331's `VioStrategy` protocol; full AC-1..10 coverage with
AC-9 + NFR-perf marked `tier2`. The C++ pybind11 binding compiles
and loads but throws `OkvisFatalException("estimator not yet wired")`
on first `add_frame` — the `okvis::ThreadedKFVio` wiring is a tier2
follow-up the Step-15 Product Completeness Gate is expected to track
as a remediation task.

Resolved contradictions:

* Constructor signature aligned with the AZ-331 factory: `(config, *,
  fdr_client, clock=None)`. Calibration / preintegrator / logger
  built internally from config. No churn on AZ-331.
* IMU substrate: OKVIS2 owns its internal estimator IMU integration;
  the AZ-276 `ImuPreintegrator` is a separate substrate consumed by
  E-C5's fusion graph. Single source of truth lives at the sample
  stream, not the integrator instance.
* FDR API: `FdrClient.enqueue(record)` with new `vio.health` kind
  added to AZ-272 `KNOWN_PAYLOAD_KEYS`.

CI matrix forces `-DBUILD_OKVIS2=OFF` until the tier2 wiring task
brings Ceres / SuiteSparse / OKVIS2 vendored submodules into the
Linux build.

Files: 17 added/modified across `c1_vio/`, `fdr_client/records.py`,
`cpp/okvis2/CMakeLists.txt`, CI workflow, AZ-332 task spec
(implementation-notes section), batch 23 report.

Tests: 17 new (15 tier1 + 2 tier2). Full Tier-1 suite: 1109 pass,
2 skipped (env), 2 deselected (tier2). No regressions.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 09:56:45 +03:00
Oleksandr Bezdieniezhnykh 9c35776bcb chore: pre-batch-23 carry-over (state + AZ-332 plan)
Handoff artifacts from the prior /autodev session that stopped at
Step 7 sub_step compute-next-batch:

- _docs/_autodev_state.md: pointer updated to batch 23, AZ-332 only
  (AZ-345 deferred — dep AZ-346 not yet in done/).
- _docs/03_implementation/AZ-332_implementation_plan.md: locked-in
  decisions (no ROS 2, no Python re-impl, three-env split: macOS dev /
  Ubuntu CI / Jetson tier2) + step-by-step playbook for next session.

Pre-batch chore commit per implement skill prereq #4 (clean tree
required before AZ-332 commit so the batch diff stays focused).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 09:18:20 +03:00
Oleksandr Bezdieniezhnykh 48ea1e2fc2 [AZ-343] C2.5 InlierCountReRanker + shared FeatureExtractor helper
Implements the production-default ReRankStrategy: K=10 → N=3 by
single-pair LightGlue inlier count, with strict drop-and-continue
(INV-8) on per-candidate TileFetch / backbone / zero-inlier failures
and RerankAllCandidatesFailedError on zero survivors. Composition
root injects the shared LightGlueRuntime + Clock + the new
FeatureExtractor helper (an L1 placeholder OpenCvOrbExtractor that
unblocks AZ-343 and future C3 strategies — task scope expansion).

Architectural notes:
- Cross-component imports stay banned; tile_store types as `object`
  and the C6 TileCacheError family is duck-typed by class module
  prefix (same workaround AZ-348 adopted for c7_inference; proper
  fix is to relocate TileCacheError to _types/ in a follow-up).
- Clock injection follows the replay contract (AZ-398 Invariant 2);
  reranked_at is sourced from clock.monotonic_ns().
- AZ-342 factory grew `feature_extractor` + `clock` + `fdr_client`
  parameters; existing AZ-342 conformance tests updated.
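
The module-prefix duck-typing mentioned in the first note could look
roughly like the sketch below. The package path in `C6_PREFIX` and the
helper name `is_from_package` are assumptions for illustration; the
point is that the check reads `__module__` strings and never imports
the other component.

```python
# Assumed dotted path of the C6 package; adjust to the real layout.
C6_PREFIX = "gps_denied_onboard.components.c6_tile_cache"


def is_from_package(exc: BaseException, prefix: str = C6_PREFIX) -> bool:
    """Return True if the exception class, or any of its base classes,
    was defined inside `prefix` or one of its submodules."""
    for cls in type(exc).__mro__:
        module = cls.__module__
        # Require an exact match or a dotted child so that a sibling
        # package sharing the same name prefix does not match by accident.
        if module == prefix or module.startswith(prefix + "."):
            return True
    return False
```

Walking the MRO means subclasses of the error family defined elsewhere
are still recognised through their C6 base class.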

Tests: 19 new AC-1..AC-12 + mixed-failure scenarios in
test_inlier_count_reranker.py; existing AZ-342 suite (26) still
green. Full repo sweep 1093 passed / 2 skipped (cmake/actionlint
not on PATH).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 06:22:40 +03:00
Oleksandr Bezdieniezhnykh 9a605c8514 [AZ-348] C3.5 ConditionalRefiner Protocol + factory + PassthroughRefiner
Defines the public `ConditionalRefiner` Protocol (PEP 544
@runtime_checkable, two methods: `refine_if_needed` +
`was_invoked`), extends `MatchResult` in-place with two
default-valued refinement fields (`refinement_label`,
`refinement_added_latency_ms`), defines the `RefinerError` family
(`RefinerBackboneError`, `RefinerConfigError`), and ships the
trivial `PassthroughRefiner` reference impl.

Both refiner strategies are linked unconditionally — no
`BUILD_REFINER_*` flag (NOT ADR-002 territory). Runtime selection
only per ADR-001. `PassthroughRefiner` returns the input
`MatchResult` by reference (bit-identical correspondences per
contract INV-5) and always reports `was_invoked() is False`.
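
A minimal sketch of the Protocol plus the passthrough behaviour
described above. The method names come from this commit; the parameter
list of `refine_if_needed` is an assumption (the real signature lives
in the contract file), and `object` stands in for `MatchResult`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ConditionalRefiner(Protocol):
    def refine_if_needed(self, match_result: object) -> object: ...

    def was_invoked(self) -> bool: ...


class PassthroughRefiner:
    """Reference impl: never refines, hands the input back untouched."""

    def refine_if_needed(self, match_result: object) -> object:
        # Returning the same object (not a copy) keeps the
        # correspondences bit-identical by construction.
        return match_result

    def was_invoked(self) -> bool:
        return False
```

Note that `isinstance` against a `runtime_checkable` Protocol only
checks that the methods exist, not their signatures.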

Documentation: renames `module-layout.md` `c3_5_adhop` Public API
symbol from `AdHoPRefinementStrategy` to `ConditionalRefiner`
(AC-14) so the doc agrees with `description.md` and the contract.

AC-9 (single-thread binding) deferred to AZ-270 runtime-root
composition, mirroring AZ-336 / AZ-342 / AZ-344 Risk-4 precedent.
AC-7 for the `"adhop"` strategy stops at `ModuleNotFoundError`
because the AdHoP backbone is owned by AZ-349. All other ACs +
NFRs covered by 36 new conformance tests.

Architectural note: `PassthroughRefiner.inference_runtime` is
typed as `object` because the L3→L3 import ban
(`test_az270_compose_root`) forbids c3_5_adhop from importing
c7_inference; the runtime-root factory narrows the type at
construction time.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:52:36 +03:00
Oleksandr Bezdieniezhnykh 89c223882b [AZ-344] C3 CrossDomainMatcher Protocol + factory + RollingHealthWindow
Defines the public `CrossDomainMatcher` Protocol (PEP 544
@runtime_checkable, two methods: `match` + `health_snapshot`),
the three frozen+slotted DTOs (`CandidateMatchSet`, `MatchResult`,
`MatcherHealth`) in the L1 `_types/matcher.py` layer, the
`MatcherError` family (`MatcherBackboneError`,
`InsufficientInliersError`), and the composition-root
`build_matcher_strategy` factory with lazy-import +
`BUILD_MATCHER_<variant>` gating per ADR-002.

`RollingHealthWindow` accumulator (60 s, amortised O(1) update,
strict O(1) snapshot) is constructed by the factory and injected
into every concrete matcher so all backbones share window
semantics; this is what backs C5's spoof-promotion gate.
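
One way to get amortised O(1) updates and strict O(1) snapshots is a
deque with running totals, sketched below. This is not the shipped
`RollingHealthWindow`: the class name, the success/total counters, and
the method signatures are assumptions chosen for illustration.

```python
from collections import deque

NS_PER_S = 1_000_000_000


class RollingWindowSketch:
    def __init__(self, window_ns: int = 60 * NS_PER_S) -> None:
        self._window_ns = window_ns
        self._events: deque[tuple[int, bool]] = deque()
        self._ok = 0
        self._total = 0

    def update(self, now_ns: int, ok: bool) -> None:
        """Record one outcome, then evict entries older than the window.

        Each entry is appended once and popped at most once, so the
        cost is O(1) amortised even when one call evicts many entries.
        """
        self._events.append((now_ns, ok))
        self._total += 1
        self._ok += int(ok)
        cutoff = now_ns - self._window_ns
        while self._events and self._events[0][0] < cutoff:
            _, old_ok = self._events.popleft()
            self._total -= 1
            self._ok -= int(old_ok)

    def snapshot(self) -> tuple[int, int]:
        """Return (ok, total) from the running counters: no loop."""
        return (self._ok, self._total)
```

The trade-off in this sketch is that `snapshot` reflects the window as
of the last `update`; eviction never runs on the read path.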

Legacy placeholder `MatchResult` removed from `_types/matching.py`;
import-only consumers (`c4_pose.interface`, `c3_5_adhop.interface`)
repointed at the new `_types/matcher.py` home — zero behavioural
change to those components.

AC-9 (single-thread binding) and AC-10 (LightGlueRuntime
identity-share with C2.5) deferred to AZ-270 runtime-root
composition, mirroring the AZ-342 Risk-4 escape clause. All other
ACs + NFRs covered by 70 new conformance tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:43:33 +03:00
Oleksandr Bezdieniezhnykh d6756f1855 [AZ-342] C2.5 ReRankStrategy: Protocol + DTOs + factory + composition
Foundational scaffolding for the InlierCountReRanker (AZ-343) and
the future C3 CrossDomainMatcher consumer (AZ-344). No concrete
re-ranker is implemented here.

* ReRankStrategy Protocol (single rerank(frame, vpr_result, n,
  calibration) -> RerankResult method) with all 8 invariants in the
  docstring — notably INV-8 drop-and-continue (per-candidate failure
  NEVER propagates unless every candidate fails).
* DTOs moved to L1 _types/rerank.py — RerankCandidate, RerankResult;
  frozen+slots; tuple-not-list for RerankResult.candidates; tile_id
  encoded as (zoom_level, lat, lon) tuple to keep _types/ free of any
  c6_tile_cache (L3) import per module-layout.md.
* Error family: RerankError + RerankBackboneError +
  RerankAllCandidatesFailedError. Only RerankAllCandidatesFailedError
  escapes rerank(); RerankBackboneError is caught inside the per-
  candidate loop, logged ERROR, FDR-stamped, candidate dropped.
* C2_5RerankConfig (strategy enum default "inlier_count", top_n int
  default 3) with strict validation at load; registered into
  Config.components on c2_5_rerank import.
* build_rerank_strategy(config, *, tile_store, lightglue_runtime)
  factory: 1-strategy resolution table, lazy import,
  BUILD_RERANK_<variant> gate, ImportError → StrategyNotAvailableError
  mapping. The shared LightGlueRuntime is constructor-injected
  (R14 fix: neither C2.5 nor C3 owns its lifecycle).

Renamed the Protocol from the existing stub "RerankStrategy" to
"ReRankStrategy" to match the contract; updated module-layout.md.
Removed the legacy RerankResult shape from _types/vpr.py — the
v1.0.0 shape lives in _types/rerank.py.

Excluded per task spec:
* Concrete InlierCountReRanker (AZ-343).
* C3 matcher protocol task (AZ-344, next in batch).
* AC-9 single-thread binding + AC-10 LightGlueRuntime identity-share
  between C2.5/C3 — deferred per task spec Risk 3 until the generic
  compose_root thread-binding registry and the C3 factory both land.

Tests: AC-1..AC-8 + AC-11 + NFR-perf-factory in
tests/unit/c2_5_rerank/test_protocol_conformance.py. The legacy
smoke test is removed. Full sweep: 997 passed (one pre-existing
flake in test_az296_takeoff_abort, subprocess timing, unrelated to
this commit; passes in isolation).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:31:27 +03:00
Oleksandr Bezdieniezhnykh 3665acef66 [AZ-336] C2 VprStrategy: Protocol + DTOs + factory + composition
Foundational scaffolding for every concrete C2 backbone (UltraVPR,
NetVLAD, MegaLoc, MixVPR, SelaVPR, EigenPlaces, SALAD — AZ-337..AZ-340)
and the C2.5 ReRanker consumer side. No backbone is implemented here.

* VprStrategy Protocol (embed_query / retrieve_topk / descriptor_dim)
  + BackbonePreprocessor C2-internal Protocol (NOT in Public API per
  description.md § 6).
* DTOs in L1 _types/vpr.py — VprQuery, VprCandidate, VprResult; all
  frozen + slots; tuple-not-list for VprResult.candidates so the
  immutability invariant truly holds.
* Error family: VprError + VprBackboneError + VprPreprocessError +
  IndexUnavailableError; same-named but namespace-distinct from
  c6_tile_cache.IndexUnavailableError (the c2 family is the closed
  envelope C5 / C2.5 consume; concrete strategies rewrap the C6 form).
* C2VprConfig (strategy enum + backbone_weights_path + faiss_index_path)
  with strict validation at load; registered into Config.components on
  c2_vpr import.
* build_vpr_strategy factory with 7-strategy resolution table, lazy
  import, BUILD_VPR_<variant> gating, ImportError→
  StrategyNotAvailableError mapping, and pre-flight descriptor_dim
  match against DescriptorIndex.descriptor_dim() — mismatch fires
  ConfigError at startup, NOT at first frame.

Contract change vs the v1.0.0 draft: factory takes descriptor_index:
DescriptorIndex (not tile_store: TileStore) because descriptor_dim()
lives on DescriptorIndex per C6's Public API. The contract markdown
is updated to match.

Architecture: VprCandidate.tile_id is a plain (zoom, lat, lon) tuple,
keeping _types/ (L1) free of any c6_tile_cache (L3) import per
module-layout.md. Consumers reconstruct TileId at the C6 boundary.

Excluded per task spec:
* Concrete backbones (AZ-337..AZ-340).
* FAISS HNSW retrieve wiring (AZ-341).
* DescriptorNormaliser helper (AZ-283, already shipped).
* AC-9 single-thread binding — deferred per task spec Risk 4 until the
  generic compose_root thread-binding registry is in place (today
  each factory owns its own, e.g. fc_factory).

Tests: 45 ACs + NFRs in tests/unit/c2_vpr/test_protocol_conformance.py
covering AC-1..AC-8, the error family, the config validation, the
factory NFR (p99 ≤ 50 ms). The legacy smoke test is removed. Full
sweep 973 passed, 2 skipped (CI-only cmake / actionlint).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:25:35 +03:00
Oleksandr Bezdieniezhnykh 823c0f1b2e [AZ-398] Replay: FrameSource + Clock Protocols + Clock injection
Ship the two Layer-1 cross-cutting Protocols replay mode needs to leave
production C1-C5 components mode-agnostic (Invariant 1) and replay-
deterministic (Invariant 2). Live + replay binaries see the same
interfaces; only the strategy differs.
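
A rough sketch of the Clock injection pattern this commit introduces.
The three method names match the bullet list in this message; the
`WallClock` body, the `FixedClock` test double, the `Stamper`
component, and the assumption that `sleep_until_ns` targets the
monotonic timeline are all illustrative guesses, not the shipped code.

```python
import time
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class Clock(Protocol):
    def monotonic_ns(self) -> int: ...

    def time_ns(self) -> int: ...

    def sleep_until_ns(self, target_ns: int) -> None: ...


class WallClock:
    """Live behaviour: delegate to the standard library."""

    def monotonic_ns(self) -> int:
        return time.monotonic_ns()

    def time_ns(self) -> int:
        return time.time_ns()

    def sleep_until_ns(self, target_ns: int) -> None:
        remaining = target_ns - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)


class FixedClock:
    """Deterministic test double: time moves only when told to."""

    def __init__(self, start_ns: int = 0) -> None:
        self.now_ns = start_ns

    def monotonic_ns(self) -> int:
        return self.now_ns

    def time_ns(self) -> int:
        return self.now_ns

    def sleep_until_ns(self, target_ns: int) -> None:
        self.now_ns = max(self.now_ns, target_ns)


class Stamper:
    """Hypothetical component: defaults to WallClock, accepts any Clock."""

    def __init__(self, clock: Optional[Clock] = None) -> None:
        self._clock = clock if clock is not None else WallClock()

    def stamp(self) -> int:
        return self._clock.monotonic_ns()
```

Because the component only sees the Protocol, the same class runs
unchanged in live and replay binaries.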

* Clock Protocol (monotonic_ns / time_ns / sleep_until_ns) +
  WallClock (live + REALTIME replay) + TlogDerivedClock (ASAP replay;
  advance-on-call; non-monotonic source → ClockOrderingError).
* FrameSource Protocol (next_frame -> NavCameraFrame | None / close)
  + LiveCameraFrameSource (cv2.VideoCapture device index) +
  VideoFileFrameSource (cv2.VideoCapture file).
* Build-flag gating: BUILD_VIDEO_FILE_FRAME_SOURCE,
  BUILD_LIVE_CAMERA_FRAME_SOURCE (constructor-time check; Tier-0 OFF
  refuses construction with FrameSourceConfigError).
* Composition-root factories: build_clock + build_frame_source.
* Injected Clock across every component that previously called
  time.monotonic_ns() / time.sleep() directly: c5_state (estimator,
  ESKF, fallback watcher, source-label SM, isam2 handle), c8_fc_adapter
  (inbound MAVLink + MSP2, AP outbound, iNav outbound, QGC GCS),
  c13_fdr writer, c12_operator_tooling httpx flights client. All
  constructors default to WallClock() so existing call sites keep
  live-binary behaviour without a wiring change.
* AC-4 CI guard (tests/_meta/test_no_direct_time_in_components.py)
  AST-scans components/**/*.py for direct time.monotonic_ns /
  time.time_ns / time.sleep references and fails loudly with file:line.
* Conformance + factory tests: tests/unit/clock + tests/unit/frame_source.
* Test fixture updates: FallbackWatcher / SourceLabelStateMachine
  clock_ns is now required (removed time.monotonic_ns default);
  test_az388 patches estimator._clock instead of a module-level time;
  test_az393 ardupilot adapter uses a _FixedClock test double.
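
The AC-4 guard can be approximated with a short `ast` walk, sketched
below. This is an illustration of the technique, not the contents of
`test_no_direct_time_in_components.py`; the function name and the
report format are assumptions. It only catches the `time.<attr>`
spelling, so `from time import sleep` or an aliased import would need
extra handling.

```python
import ast

BANNED_ATTRS = {"monotonic_ns", "time_ns", "sleep"}


def find_direct_time_refs(source: str, filename: str = "<src>") -> list[str]:
    """Return 'file:line: time.attr' strings for banned references."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Attribute)
            and node.attr in BANNED_ATTRS
            and isinstance(node.value, ast.Name)
            and node.value.id == "time"
        ):
            hits.append((node.lineno, f"time.{node.attr}"))
    # ast.walk is breadth-first, so sort to report in source order.
    return [f"{filename}:{line}: {ref}" for line, ref in sorted(hits)]
```

A test would run this over every file under `components/` and fail if
the combined list is non-empty.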

Excluded per the task spec: TlogReplayFcAdapter (AZ-399), ReplaySink
(AZ-400), compose_replay (AZ-401), CLI (AZ-402), Docker/CI (AZ-403),
E2E fixture (AZ-404), IMU auto-sync (AZ-405).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:10:01 +03:00
Oleksandr Bezdieniezhnykh 6c7d24f7e0 [AZ-331] C1 VioStrategy: Protocol + DTOs + factory + C5 migration
Freezes the c1_vio Public API per
_docs/02_document/contracts/c1_vio/vio_strategy_protocol.md v1.0.0:

- VioStrategy Protocol (4 methods: process_frame, reset_to_warm_start,
  health_snapshot, current_strategy_label) in
  components/c1_vio/interface.py.
- DTOs (VioOutput, VioHealth, FeatureQuality, WarmStartPose) + VioState
  enum in _types/nav.py — L1 placement so C5 + C13 consume them without
  crossing the components.* boundary (AZ-270 AC-6). The new VioOutput
  shape (frame_id: str, relative_pose_T: gtsam.Pose3,
  pose_covariance_6x6, imu_bias, feature_quality, emitted_at_ns)
  replaces the AZ-263 scaffolding in _types/vio.py, which is now
  deleted.
- VioError family (VioInitializingError / VioDegradedError /
  VioFatalError) in components/c1_vio/errors.py. Documented
  rationale: the degraded-operation path returns a VioOutput with
  inflated covariance + VioHealth.state=DEGRADED rather than raising
  VioDegradedError — the error type exists only for the rare
  degraded->fatal transition.
- C1VioConfig per-component config block (strategy enum,
  lost_frame_threshold default 9, warm_start_max_frames default 5)
  with constructor-time validation rejecting unknown strategy labels.
- StrategyNotAvailableError added to runtime_root/errors.py;
  composition-time error distinct from the VioError family.
- Composition-root factory build_vio_strategy in
  runtime_root/vio_factory.py with three BUILD_* gates (BUILD_OKVIS2,
  BUILD_VINS_MONO, BUILD_KLT_RANSAC). Concrete strategy modules are
  imported lazily via __import__ AFTER the flag check — Tier-0
  workstation builds with the flag OFF MUST NOT load the strategy
  module (Risk-2 / I-5; verifiable via sys.modules).
- 36 conformance tests cover all 9 ACs + NFR-perf-factory
  (p99 build under 200 ms x 1000 calls) + NFR-reliability-error-family.
  AC-8 introspects the contract file's Shape table and asserts method
  parity against the runtime Protocol; AC-9 asserts the frame_id
  annotation is 'str' (PEP-563 stringified).
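
The flag-then-lazy-import ordering in the factory bullet can be
sketched as follows. Assumptions: build flags arrive as a plain dict,
the module names in `DEFAULT_TABLE` are placeholders, the function
returns the module rather than a constructed strategy, and
`importlib.import_module` is used here in place of the `__import__`
call the commit mentions.

```python
import importlib
from types import ModuleType


class StrategyNotAvailableError(RuntimeError):
    """Composition-time error: flag OFF or strategy module missing."""


# label -> (build flag, dotted module name); module names are placeholders.
DEFAULT_TABLE = {
    "okvis2": ("BUILD_OKVIS2", "hypothetical_pkg.okvis2"),
    "vins_mono": ("BUILD_VINS_MONO", "hypothetical_pkg.vins_mono"),
    "klt_ransac": ("BUILD_KLT_RANSAC", "hypothetical_pkg.klt_ransac"),
}


def load_strategy_module(
    label: str,
    build_flags: dict[str, bool],
    table: dict[str, tuple[str, str]] = DEFAULT_TABLE,
) -> ModuleType:
    if label not in table:
        raise ValueError(f"unknown strategy label: {label!r}")
    flag, module_name = table[label]
    # Check the flag BEFORE importing, so a flag-OFF build never
    # loads the strategy module (observable via sys.modules).
    if not build_flags.get(flag, False):
        raise StrategyNotAvailableError(f"{label!r} requires {flag}=ON")
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise StrategyNotAvailableError(
            f"{label!r}: module {module_name!r} is not importable"
        ) from exc
```

Keeping the import inside the function is what makes the
`sys.modules` assertion in the tests meaningful.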

C5 migration (consumers of the new VioOutput shape):
- gtsam_isam2_estimator.py + eskf_baseline.py: replaced
  vio.timestamp -> vio.emitted_at_ns (drops _datetime_to_ns on the
  VIO path), vio.pose_se3 -> vio.relative_pose_T (gtsam.Pose3 direct;
  drops _pose_se3_to_gtsam / _pose_se3_to_array), vio.covariance_6x6
  -> vio.pose_covariance_6x6 (rename).
- key_for_frame signature widened to UUID | int | str to accept the
  new str frame_id.
- 4 C5 test files migrated to the new VioOutput shape with helper
  fixtures producing ImuBias + FeatureQuality + str frame_id.
- c5_state/interface.py TYPE_CHECKING import path updated.

Bootstrap healthcheck + test_types_importable updated to drop the
deleted _types/vio module and pick up _types/inference (AZ-297) in
the same sweep.

Full unit-test sweep: 884 passed, 2 pre-existing environment skips
(cmake, actionlint).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 04:44:31 +03:00
117 changed files with 10997 additions and 373 deletions
+12 -2
@@ -47,10 +47,20 @@ jobs:
 matrix:
 kind: [deployment, research]
 include:
+# AZ-332 — BUILD_OKVIS2 forced OFF in Tier-1 CI until the tier2
+# follow-up wires `okvis::ThreadedKFVio` end-to-end. The C++
+# binding skeleton + CMake glue still ship in this build; full
+# OKVIS2 native compile is gated on installing Ceres-solver +
+# OKVIS2 vendored submodules (BRISK, DBoW2) via apt, plus
+# `submodules: recursive` checkout. That CI lift is the
+# tier2 task's surface, not AZ-332's.
 - kind: deployment
-cmake_flags: "-DBUILD_VINS_MONO=OFF -DBUILD_VPR_SALAD=OFF -DBUILD_C11_TILE_MANAGER=OFF"
+cmake_flags: >-
+-DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=OFF
+-DBUILD_VPR_SALAD=OFF -DBUILD_C11_TILE_MANAGER=OFF
 - kind: research
-cmake_flags: "-DBUILD_VINS_MONO=ON -DBUILD_VPR_SALAD=ON"
+cmake_flags: >-
+-DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=ON -DBUILD_VPR_SALAD=ON
 steps:
 - uses: actions/checkout@v4
 - run: cmake -S . -B build ${{ matrix.cmake_flags }}
+6
@@ -0,0 +1,6 @@
+[submodule "cpp/pybind11/upstream"]
+path = cpp/pybind11/upstream
+url = https://github.com/pybind/pybind11.git
+[submodule "cpp/okvis2/upstream"]
+path = cpp/okvis2/upstream
+url = https://github.com/smartroboticslab/okvis2.git
@@ -4,11 +4,12 @@
**Purpose**: re-rank C2's top-K=10 VPR candidates down to top-N=3 by single-pair LightGlue inlier count, producing a higher-precision input for the cross-domain matcher (C3). The re-rank step is the architectural boundary between cheap descriptor retrieval (C2) and expensive cross-domain matching (C3) — it pays a small extra cost so C3 only operates on the most promising candidates.
-**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
+**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`, AZ-343). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
 **Upstream dependencies**:
 - C2 → `VprResult` (top-K=10 candidates).
 - Shared `LightGlueRuntime` helper (used in single-pair mode for inlier counting; the same matcher object is shared with C3 — owned by the helper, not by C3, so neither component depends on the other at build time).
+- Shared `FeatureExtractor` helper (`helpers/feature_extractor.py`, AZ-343 scope expansion) — extracts `KeypointSet` from both the per-frame nav image and each candidate's tile JPEG; the placeholder impl is `OpenCvOrbExtractor`, swapped out for a TRT-backed deep extractor before flight.
 - C6 TileStore → fetch tile pixels for each candidate (cheap, in-memory page-cache hit during a flight).
 - Camera calibration artifact — for nav-frame preprocessing.
@@ -59,7 +60,7 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
**Algorithmic Complexity**: `O(K)` LightGlue forward passes per frame (K=10), each `O(M_tile · M_query)` in feature counts. The whole step is GPU-bound on the same engine that C3 uses — hence the shared LightGlue runtime.
-**State Management**: stateless per-frame. Holds a reference to the shared LightGlue object owned by C3.
+**State Management**: stateless per-frame. Holds references to the constructor-injected `LightGlueRuntime`, `FeatureExtractor`, `TileStore`, `Clock`, and (optionally) `FdrClient` — all lifecycle-owned by the runtime root, not by C2.5.
**Key Dependencies**:
@@ -78,6 +79,8 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
| Helper | Purpose | Used By |
|--------|---------|---------|
| `LightGlueRuntime` | shared LightGlue inference handle (one engine, many call sites) | C2.5, C3 |
+| `FeatureExtractor` (`helpers/feature_extractor.py`) | shared image → `KeypointSet` extractor; default `OpenCvOrbExtractor`, target TRT-backed DISK/ALIKED | C2.5, future C3 backbones |
+| `Clock` (`gps_denied_onboard.clock`) | composition-root time source; stamps `RerankResult.reranked_at` via `clock.monotonic_ns()` (Invariant 2 of the replay contract — no direct `time.*` in components) | every C* component |
## 7. Caveats & Edge Cases
@@ -4,7 +4,7 @@
**Producer task**: AZ-342 (`ReRankStrategy` Protocol + factory + composition)
**Consumer tasks**: AZ-343 (`InlierCountReRanker` impl); downstream c3_matcher (epic AZ-257 / E-C3 — TBD at AZ-257 decompose time) which consumes `RerankResult`
**Version**: 1.0.0
-**Status**: draft, awaiting AZ-342 implementation
+**Status**: v1.0.0 (AZ-342 implemented 2026-05-12)
**Last Updated**: 2026-05-10
**Module-layout home**: `src/gps_denied_onboard/components/c2_5_rerank/interface.py` (Protocol), `src/gps_denied_onboard/components/c2_5_rerank/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/rerank_factory.py` (factory)
@@ -75,31 +75,29 @@ class ReRankStrategy(Protocol):
```python
from dataclasses import dataclass
-from uuid import UUID
-import numpy as np
@dataclass(frozen=True, slots=True)
class RerankCandidate:
"""One re-rank survivor. Carries the C2-stage descriptor_distance forward for FDR provenance plus the new inlier_count from single-pair LightGlue."""
-tile_id: tuple # composite (zoomLevel, lat, lon); see C6 TileRecord
-inlier_count: int # single-pair LightGlue inliers; > 0 for any survivor
-descriptor_distance: float # carried forward from C2's VprCandidate
-descriptor_dim: int # carried forward from C2 for sanity assertions
-tile_pixels_handle: object # opaque page-cache-backed pixel reference; see C6 TileStore contract
+tile_id: tuple[int, float, float] # composite (zoom_level, lat, lon); matches c6_tile_cache.TileId. tuple form keeps _types/ free of an L1→L3 import per module-layout.md.
+inlier_count: int # single-pair LightGlue inliers; > 0 for any survivor
+descriptor_distance: float # carried forward from C2's VprCandidate
+descriptor_dim: int # carried forward from C2 for sanity assertions
+tile_pixels_handle: object # opaque page-cache-backed pixel reference; see C6 TileStore contract
@dataclass(frozen=True, slots=True)
class RerankResult:
"""Top-N survivors from `ReRankStrategy.rerank`. Consumed by C3 CrossDomainMatcher."""
-frame_id: UUID
-candidates: list[RerankCandidate] # 0 < len <= n; sorted descending by inlier_count, ties broken by descriptor_distance ascending
-reranked_at: int # monotonic_ns
-rerank_label: str # non-empty; matches BUILD_RERANK_<variant> lowercase (e.g., "inlier_count")
-candidates_input: int # len(vpr_result.candidates) at entry — for FDR observability
-candidates_dropped: int # candidates_input - len(candidates)
+frame_id: int # echoes NavCameraFrame.frame_id (int across the pipeline)
+candidates: tuple[RerankCandidate, ...] # 0 < len <= n; descending by inlier_count, ties broken by descriptor_distance ascending. tuple (not list) so the frozen+slots invariant holds.
+reranked_at: int # monotonic_ns from injected Clock
+rerank_label: str # non-empty; matches BUILD_RERANK_<variant> lowercase
+candidates_input: int # len(vpr_result.candidates) at entry — for FDR observability
+candidates_dropped: int # candidates_input - len(candidates)
```
### Error Hierarchy (in `c2_5_rerank/errors.py`)
@@ -4,7 +4,7 @@
**Producer task**: AZ-336 (`VprStrategy` Protocol + factory + composition)
**Consumer tasks**: AZ-337 (UltraVPR), AZ-338 (NetVLAD baseline), AZ-339 (MegaLoc + MixVPR), AZ-340 (SelaVPR + EigenPlaces + SALAD), AZ-341 (FAISS HNSW retrieve wiring), and downstream c2_5_rerank (AZ-256 / E-C2.5)
**Module-layout home**: `src/gps_denied_onboard/components/c2_vpr/interface.py` (Protocols), `src/gps_denied_onboard/components/c2_vpr/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/vpr_factory.py` (factory)
-**Status**: draft, awaiting AZ-336 implementation
+**Status**: v1.0.0 (AZ-336 implemented 2026-05-12)
## Purpose
@@ -69,25 +69,23 @@ class VprStrategy(Protocol):
```python
from dataclasses import dataclass
-from uuid import UUID
-import numpy as np
@dataclass(frozen=True, slots=True)
class VprQuery:
"""Backbone embedding for a single nav-camera frame. Produced by `VprStrategy.embed_query`; consumed by `VprStrategy.retrieve_topk` (same instance) or — in the C10 corpus-build path — by `DescriptorIndexBuilder` to populate the corpus descriptor matrix."""
-frame_id: UUID
-embedding: np.ndarray # shape (D,), dtype float16 or float32; L2-normalised
-produced_at: int # monotonic_ns
+frame_id: int # echoes NavCameraFrame.frame_id (the source carries int across the pipeline)
+embedding: object # numpy.ndarray, shape (D,), dtype float16|float32; L2-normalised. typed object to keep _types/ free of numpy import-time dep.
+produced_at: int # monotonic_ns from injected Clock
@dataclass(frozen=True, slots=True)
class VprCandidate:
"""One retrieval candidate from the top-K result."""
-tile_id: tuple # composite (zoomLevel, lat, lon); see C6 TileRecord
-descriptor_distance: float # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
+tile_id: tuple[int, float, float] # composite (zoom_level, lat, lon); matches c6_tile_cache.TileId. tuple form keeps _types/ free of an L1→L3 import per module-layout.md.
+descriptor_distance: float # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
descriptor_dim: int
@@ -95,10 +93,10 @@ class VprCandidate:
class VprResult:
"""Top-K candidates from `VprStrategy.retrieve_topk`. Consumed by C2.5 ReRanker."""
-frame_id: UUID
-candidates: list[VprCandidate] # length == k, sorted ascending by descriptor_distance
-retrieved_at: int # monotonic_ns
-backbone_label: str # non-empty; matches BUILD_VPR_<variant> lowercase
+frame_id: int # echoes the source NavCameraFrame.frame_id
+candidates: tuple[VprCandidate, ...] # length == k, ascending by descriptor_distance. tuple (not list) so the frozen+slots invariant holds.
+retrieved_at: int # monotonic_ns from injected Clock
+backbone_label: str # non-empty; matches BUILD_VPR_<variant> lowercase
```
### Protocol: `BackbonePreprocessor` (C2-internal; lives in `c2_vpr/_preprocessor.py`)
@@ -155,18 +153,21 @@ class IndexUnavailableError(VprError):
# src/gps_denied_onboard/runtime_root/vpr_factory.py
from typing import TYPE_CHECKING
-from gps_denied_onboard.config import Config
+from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.components.c2_vpr import VprStrategy
-from gps_denied_onboard.components.c6_tile_cache import TileStore
+from gps_denied_onboard.components.c6_tile_cache import DescriptorIndex
from gps_denied_onboard.components.c7_inference import InferenceRuntime
def build_vpr_strategy(
config: Config,
-tile_store: TileStore,
+*,
+descriptor_index: DescriptorIndex,
inference_runtime: InferenceRuntime,
) -> VprStrategy:
-"""Composition-root factory. Reads `config.vpr.strategy` and `config.vpr.backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `ConfigurationError` pointing at the offending strategy name + missing flag).
+"""Composition-root factory. Reads `config.components['c2_vpr'].strategy` and `config.components['c2_vpr'].backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `StrategyNotAvailableError` pointing at the offending strategy name + missing flag).
+`descriptor_index` (NOT `tile_store`) is injected: the pre-flight `descriptor_dim` validation reads from the C6 `DescriptorIndex.descriptor_dim()` which is the Public API that owns the FAISS index sidecar. The contract draft earlier named this parameter `tile_store`; the implementation moved it to match C6's actual Public API.
Strategy resolution table:
@@ -180,9 +181,9 @@ def build_vpr_strategy(
| "eigen_places" | EigenPlacesStrategy | components.c2_vpr.eigen_places | BUILD_VPR_EIGENPLACES |
| "salad" | SaladStrategy | components.c2_vpr.salad | BUILD_VPR_SALAD |
-Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches the C6 corpus index's declared `descriptor_dim` (read from the FAISS index sidecar). Mismatch → `ConfigurationError` at startup, NOT at first frame.
+Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches `descriptor_index.descriptor_dim()` (the FAISS index sidecar value). Mismatch → `ConfigError` at startup, NOT at first frame.
-Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread.
+Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread (AC-9 deferred until the generic compose_root thread-binding registry is in place; see task spec Risk 4).
"""
...
```
@@ -4,8 +4,8 @@
**Producer task**: AZ-348 (Protocol + factory + DTOs + composition + `PassthroughRefiner`)
**Consumer tasks**: AZ-349 (`AdHoPRefiner` real refinement); downstream c4_pose (epic AZ-259) which consumes the (possibly refined) `MatchResult`
**Version**: 1.0.0
-**Status**: draft, awaiting Producer task implementation
-**Last Updated**: 2026-05-10
+**Status**: v1.0.0 (AZ-348 implemented 2026-05-12; PassthroughRefiner shipped — AdHoPRefiner pending AZ-349)
+**Last Updated**: 2026-05-12
**Module-layout home**: `src/gps_denied_onboard/components/c3_5_adhop/interface.py` (Protocol), `src/gps_denied_onboard/components/c3_5_adhop/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/refiner_factory.py` (factory)
> **Public API symbol naming.** The component's public interface symbol is named `ConditionalRefiner` in `description.md` § 2 and `AdHoPRefinementStrategy` in `module-layout.md` § c3_5_adhop. Both refer to the SAME Protocol; the canonical class name in code is `ConditionalRefiner` — it is the role description-first name and matches the method `refine_if_needed`. The producer task ALSO updates `module-layout.md` to align (`AdHoPRefinementStrategy` → `ConditionalRefiner`) so the two documents agree.
@@ -4,8 +4,8 @@
**Producer task**: AZ-344 (`CrossDomainMatcher` Protocol + factory + composition)
**Consumer tasks**: AZ-345 (DISK+LightGlue primary), AZ-346 (ALIKED+LightGlue secondary), AZ-347 (XFeat alternate); downstream c3_5_adhop (epic AZ-258) which consumes `MatchResult`
**Version**: 1.0.0
-**Status**: draft, awaiting AZ-344 implementation
-**Last Updated**: 2026-05-10
+**Status**: v1.0.0 (AZ-344 implemented 2026-05-12)
+**Last Updated**: 2026-05-12
**Module-layout home**: `src/gps_denied_onboard/components/c3_matcher/interface.py` (Protocol), `src/gps_denied_onboard/components/c3_matcher/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/matcher_factory.py` (factory)
## Purpose
@@ -65,14 +65,13 @@ class CrossDomainMatcher(Protocol):
```python
from dataclasses import dataclass
-from uuid import UUID
import numpy as np
@dataclass(frozen=True, slots=True)
class CandidateMatchSet:
"""Per-candidate matching outcome inside a MatchResult."""
-tile_id: tuple # composite (zoomLevel, lat, lon)
+tile_id: tuple[int, float, float] # composite (zoomLevel, lat, lon); mirrors VprCandidate / RerankCandidate encoding so the L1 _types layer is free of an L1→L3 import to c6_tile_cache.TileId
inlier_count: int
inlier_correspondences: np.ndarray # shape (I, 4) float32; (px_query, py_query, px_tile, py_tile)
ransac_outlier_count: int
@@ -82,8 +81,8 @@ class CandidateMatchSet:
@dataclass(frozen=True, slots=True)
class MatchResult:
"""Cross-domain match outcome for one frame. Consumed by C3.5 ConditionalRefiner."""
frame_id: UUID
per_candidate: list[CandidateMatchSet] # 0 < len <= N=3, ranked by inlier_count descending; ties broken by per_candidate_residual_px ascending
frame_id: int # mirrors NavCameraFrame.frame_id; matches AZ-336 / AZ-342 encoding
per_candidate: tuple[CandidateMatchSet, ...] # 0 < len <= N=3, ranked by inlier_count descending; ties broken by per_candidate_residual_px ascending. tuple (not list) so frozen+slots actually holds.
best_candidate_idx: int # 0 by construction (sorted)
reprojection_residual_px: float # best candidate's median residual
matched_at: int # monotonic_ns
@@ -115,6 +114,8 @@ class InsufficientInliersError(MatcherError):
"""Every candidate failed OR every candidate's inlier count is below `config.matcher.min_inliers_threshold`. Raised by `match`. C5 falls back to VIO-only."""
```
The composition-time selection error is **`StrategyNotAvailableError`** (`runtime_root.errors`), NOT a member of `MatcherError`: it surfaces when the binary lacks the requested `BUILD_MATCHER_<variant>` flag or the concrete strategy module is not built yet (AZ-345..AZ-347 pending). This matches the C2 VPR (AZ-336) and C2.5 ReRank (AZ-342) factory pattern: per-frame matcher errors live in the C3 family; composition-time selection errors live in the shared runtime-root family.
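The separation between the two error families can be sketched as below. This is an illustration only, not the project's code: the `build_matcher` function, the `enabled_flags` set, and the way the flag name is derived from the variant are hypothetical; only the four class names come from this contract.

```python
"""Sketch: per-frame matcher errors vs. composition-time selection errors."""


class MatcherError(Exception):
    """Root of the C3 per-frame error family."""


class MatcherBackboneError(MatcherError):
    """Backbone failure while matching a frame."""


class InsufficientInliersError(MatcherError):
    """Every candidate failed or fell below the inlier threshold."""


class StrategyNotAvailableError(Exception):
    """Composition-time error; deliberately NOT a MatcherError subclass."""


def build_matcher(variant: str, enabled_flags: set[str]) -> str:
    """Hypothetical factory: refuse a variant whose build flag is absent."""
    flag = f"BUILD_MATCHER_{variant.upper()}"
    if flag not in enabled_flags:
        raise StrategyNotAvailableError(f"{variant!r} requires {flag}")
    return variant  # stand-in for the constructed strategy object


def selection_error_escapes_matcher_handler() -> bool:
    """A handler written for MatcherError must not swallow a selection error."""
    try:
        build_matcher("xfeat", enabled_flags={"BUILD_MATCHER_DISK_LIGHTGLUE"})
    except MatcherError:
        return False
    except StrategyNotAvailableError:
        return True
    return False
```

Keeping the selection error outside the `MatcherError` tree means a per-frame `except MatcherError` fallback (for example, dropping to VIO-only) cannot accidentally mask a misconfigured build.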
## Composition-Root Factory
```python
@@ -57,12 +57,14 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Epic**: AZ-256 (E-C2.5 Rerank)
- **Directory**: `src/gps_denied_onboard/components/c2_5_rerank/`
- **Public API**:
- `__init__.py` (re-exports `RerankStrategy`, `RerankResult`)
- `interface.py` (`RerankStrategy` Protocol)
- `__init__.py` (re-exports `ReRankStrategy`, `RerankResult`, `RerankCandidate`, `RerankError` family, `C2_5RerankConfig`)
- `interface.py` (`ReRankStrategy` Protocol)
- `config.py` (`C2_5RerankConfig` dataclass; registered on import; `strategy`, `top_n`, `debug_per_frame_log` fields)
- `errors.py` (`RerankError`, `RerankBackboneError`, `RerankAllCandidatesFailedError`)
- **Internal**:
- `inlier_based_reranker.py` (single-pair LightGlue inlier count K=10→N=3)
- **Owns**: `src/gps_denied_onboard/components/c2_5_rerank/**`, `tests/unit/c2_5_rerank/**`
- **Imports from**: `_types`, `helpers.lightglue_runtime`, `helpers.descriptor_normaliser`, `helpers.ransac_filter`, `helpers.se3_utils`, `components.c6_tile_cache` (Public API), `components.c7_inference`, `config`, `logging`, `fdr_client`
- `inlier_based_reranker.py` (`InlierCountReRanker`: single-pair LightGlue inlier count K=10→N=3, AZ-343; module-level `create()` factory entry-point consumed by `runtime_root.rerank_factory.build_rerank_strategy`; gated by `BUILD_RERANK_INLIER_COUNT`)
- **Owns**: `src/gps_denied_onboard/components/c2_5_rerank/**`, `src/gps_denied_onboard/runtime_root/rerank_factory.py`, `tests/unit/c2_5_rerank/**`
- **Imports from**: `_types`, `helpers.lightglue_runtime`, `helpers.feature_extractor` (AZ-343 scope expansion), `helpers.descriptor_normaliser`, `helpers.ransac_filter`, `helpers.se3_utils`, `components.c6_tile_cache` (Public API only — `TileStore`, `TilePixelHandle`, `TileCacheError` family), `clock`, `config`, `logging`, `fdr_client`
- **Consumed by**: `c3_matcher`, `runtime_root`
### Component: c3_matcher
@@ -70,15 +72,18 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Epic**: AZ-257 (E-C3 Cross-Domain Matcher)
- **Directory**: `src/gps_denied_onboard/components/c3_matcher/`
- **Public API**:
- `__init__.py` (re-exports `CrossDomainMatcher`, `MatchResult`)
- `__init__.py` (re-exports `CrossDomainMatcher`, `MatchResult`, `MatcherHealth`, `CandidateMatchSet`, `MatcherError`, `MatcherBackboneError`, `InsufficientInliersError`, `C3MatcherConfig`)
- `interface.py` (`CrossDomainMatcher` Protocol)
- `config.py` (`C3MatcherConfig`)
- `errors.py` (error hierarchy)
- **Internal**:
- `disk_lightglue.py` (DISK + LightGlue)
- `aliked_lightglue.py` (ALIKED + LightGlue)
- `xfeat.py`
- `_health_window.py` (`RollingHealthWindow` accumulator; constructor-injected into every concrete matcher)
- `disk_lightglue.py` (DISK + LightGlue, AZ-345)
- `aliked_lightglue.py` (ALIKED + LightGlue, AZ-346)
- `xfeat.py` (XFeat, AZ-347)
- `_native/`
- **Owns**: `src/gps_denied_onboard/components/c3_matcher/**`, `tests/unit/c3_matcher/**`
- **Imports from**: `_types`, `helpers.lightglue_runtime` (R14: SHARED with C2.5 — owned by helper, NOT by C3), `helpers.descriptor_normaliser`, `helpers.se3_utils`, `components.c7_inference`, `config`, `logging`, `fdr_client`
- **Owns**: `src/gps_denied_onboard/components/c3_matcher/**`, `tests/unit/c3_matcher/**`, `src/gps_denied_onboard/runtime_root/matcher_factory.py`
- **Imports from**: `_types`, `helpers.lightglue_runtime` (R14: SHARED with C2.5 — owned by helper, NOT by C3), `helpers.ransac_filter`, `helpers.descriptor_normaliser`, `helpers.se3_utils`, `components.c7_inference`, `config`, `logging`, `fdr_client`
- **Consumed by**: `c3_5_adhop`, `runtime_root`
### Component: c3_5_adhop
@@ -86,11 +91,15 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Epic**: AZ-258 (E-C3.5 AdHoP Refinement)
- **Directory**: `src/gps_denied_onboard/components/c3_5_adhop/`
- **Public API**:
- `__init__.py` (re-exports `AdHoPRefinementStrategy`)
- `interface.py` (`AdHoPRefinementStrategy` Protocol)
- **Internal**: `default_refiner.py`
- **Owns**: `src/gps_denied_onboard/components/c3_5_adhop/**`, `tests/unit/c3_5_adhop/**`
- **Imports from**: `_types`, `helpers.ransac_filter`, `helpers.se3_utils`, `config`, `logging`, `fdr_client`
- `__init__.py` (re-exports `ConditionalRefiner`, `C3_5RefinerConfig`)
- `interface.py` (`ConditionalRefiner` Protocol)
- `config.py` (`C3_5RefinerConfig`)
- `errors.py` (`RefinerError`, `RefinerBackboneError`, `RefinerConfigError` — held internal to the component; consumers reach them only via tests)
- **Internal**:
- `passthrough_refiner.py` (reference baseline; AZ-348)
- `adhop_refiner.py` (production-default; AZ-349 pending)
- **Owns**: `src/gps_denied_onboard/components/c3_5_adhop/**`, `tests/unit/c3_5_adhop/**`, `src/gps_denied_onboard/runtime_root/refiner_factory.py`
- **Imports from**: `_types`, `helpers.ransac_filter` (R14: SHARED with C3 and C4 — owned by helper, NOT by C3.5), `helpers.se3_utils`, `components.c7_inference`, `config`, `logging`, `fdr_client`
- **Consumed by**: `c4_pose`, `runtime_root`
### Component: c4_pose
@@ -282,6 +291,13 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Owned by**: AZ-264.
- **Consumed by**: c2_5_rerank, c3_matcher.
### shared/helpers/feature_extractor
- **Directory**: `src/gps_denied_onboard/helpers/feature_extractor.py`
- **Purpose**: Shared image → `KeypointSet` Protocol + placeholder `OpenCvOrbExtractor` impl (AZ-343 scope expansion). Lets every consumer that feeds `LightGlueRuntime.match` reach for the SAME extractor (same descriptor distribution, same `descriptor_dim`) without each strategy reinventing its own preprocessing.
- **Owned by**: AZ-343.
- **Consumed by**: c2_5_rerank (today via `InlierCountReRanker`), c3_matcher (future concrete strategies in AZ-345 / AZ-346 / AZ-347).
### shared/helpers/wgs_converter
- **Directory**: `src/gps_denied_onboard/helpers/wgs_converter.py`
@@ -32,18 +32,18 @@ This task delivers the canonical production VIO. The other two strategies (VINS-
- An `Okvis2Strategy` class at `src/gps_denied_onboard/components/c1_vio/okvis2.py` conforming to the `VioStrategy` Protocol from AZ-331; `current_strategy_label() == "okvis2"`.
- A pybind11 wrapper at `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp` exposing the OKVIS2 C++ estimator (`okvis::ThreadedKFVio` or equivalent in the pinned upstream HEAD) to Python. The wrapper is built by CMake under `cpp/okvis2/` (build-time gated by `BUILD_OKVIS2`); the resulting `.so` is imported lazily inside `okvis2.py`.
- Constructor `__init__(self, *, calibration: CameraCalibration, preintegrator: ImuPreintegrator, fdr_client: FdrClient, logger: Logger, config: Okvis2Config)` — all dependencies constructor-injected per ADR-009. `Okvis2Config` (`@dataclass(frozen=True)`) carries the OKVIS2-specific knobs (sliding-window size K ∈ [10, 20], keyframe-decision parallax threshold, RANSAC inlier ratio, max optimisation iterations) loaded from `config.vio.okvis2.*` via AZ-269.
- Constructor `__init__(self, config: Config, *, fdr_client: FdrClient, clock: Clock | None = None)` — matches the AZ-331 composition-root factory shape (resolved 2026-05-12 against the existing factory call site `strategy_cls(config, fdr_client=fdr_client)`). Other dependencies (logger, camera calibration, IMU preintegrator substrate, OKVIS2-specific sub-config) are resolved internally from `config`. `Okvis2Config` (`@dataclass(frozen=True)`) carries the OKVIS2-specific knobs (sliding-window size K ∈ [10, 20], keyframe-decision parallax threshold, RANSAC inlier ratio, max optimisation iterations, degraded-feature threshold, per-frame debug log) loaded from `config.components.c1_vio.okvis2.*` via AZ-269 / AZ-331. `clock` defaults to `WallClock()` for live + REALTIME-replay tiers; replay-ASAP composition injects a `TlogDerivedClock` (Invariant 2 of the replay contract).
- `process_frame(frame, imu, calibration) -> VioOutput`:
1. Append IMU samples to the injected `ImuPreintegrator` (strict-monotonic guarded; `ImuPreintegrationError` rewraps to `VioFatalError`).
1. Push every IMU sample in the window into the OKVIS2 backend via `add_imu` (strict-monotonic enforced on the C++ side). OKVIS2 owns its own internal IMU integration for the VIO estimator's per-keyframe factor — the AZ-276 `ImuPreintegrator` is a *separate* substrate used by E-C5's fusion graph, NOT the input to OKVIS2's internal estimator. The "single source of IMU truth" invariant operates at the *sample-stream* level (one IMU producer), not at the integrator-instance level.
2. Feed the nav-camera frame to OKVIS2 via the pybind11 `add_frame` wrapper.
3. If OKVIS2 emits a new estimator update, extract the relative pose (SE(3) via `helpers.se3_utils`), the 6×6 covariance from OKVIS2's internal Hessian (or marginalised block per upstream API), the latest IMU bias, and the feature-quality summary (tracked / new / lost / mean parallax / per-frame MRE).
4. Build and return `VioOutput` with `frame_id` echoed.
4. Build and return `VioOutput` with `frame_id` echoed (stringified).
5. Emit per-frame DEBUG log (off by default) with backbone identity + elapsed milliseconds; emit WARN log when degraded covariance is detected (per `health_snapshot` heuristic); emit ERROR log on `VioFatalError`.
- `reset_to_warm_start(hint)`: tears down the current OKVIS2 estimator instance (releases C++ resources), constructs a fresh estimator, seeds the IMU bias from `hint.bias`, seeds the initial body-to-world pose from `hint.body_T_world`, and seeds the velocity from `hint.velocity_b`. The next `config.vio.warm_start_max_frames` frames are allowed to converge before the strategy reports `state == TRACKING` (AC-5.1). Calling `reset_to_warm_start` is idempotent across consecutive calls (the second call re-resets cleanly).
- `health_snapshot()` returns `VioHealth(state, consecutive_lost, bias_norm)` derived from OKVIS2's internal tracker state: `INIT` until enough keyframes are accumulated, `TRACKING` while the optimisation converges, `DEGRADED` when feature count drops below `config.vio.okvis2.degraded_feature_threshold` or covariance Frobenius norm exceeds 2× steady-state, `LOST` after `config.vio.lost_frame_threshold` consecutive frames without a successful update.
- The honest-covariance invariant (Protocol Invariant) is enforced behaviourally: the strategy MUST NOT shrink the reported covariance during a `DEGRADED` window (the OKVIS2 estimator's covariance is read directly; no smoothing or floor is applied that would mask degradation).
- Error envelope is closed: every OKVIS2 / pybind11 / Eigen exception is caught inside `process_frame` / `reset_to_warm_start` and rewrapped into the `VioError` family (`VioInitializingError` while INIT, `VioFatalError` on backend-init failure or sustained LOST).
- All FDR records emitted via the injected `FdrClient` use the `kind="vio.health"` schema from AZ-272; per-frame DEBUG goes to stdout/journald only (per description.md § 9 logging strategy).
- All FDR records emitted via the injected `FdrClient.enqueue(record)` use the new `kind="vio.health"` schema (added to AZ-272's `KNOWN_PAYLOAD_KEYS` by this task — payload: `state`, `consecutive_lost`, `bias_norm`, `strategy_label`, `frame_id`); per-frame DEBUG goes to stdout/journald only (per description.md § 9 logging strategy).
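The `health_snapshot()` rules and the `vio.health` payload described above can be sketched as follows. This is a minimal illustration under stated assumptions: `TrackerSnapshot`, `classify`, `health_payload`, the `min_keyframes` and `steady_state_cov` parameters, and the precedence order (LOST checked first, then INIT) are hypothetical and not taken from the task spec; only the state names, the 2× covariance rule, the threshold names, and the five payload keys come from the text.

```python
"""Sketch: deriving a VioHealth-style state and a vio.health payload."""
from dataclasses import dataclass
from enum import Enum


class VioState(str, Enum):
    INIT = "INIT"
    TRACKING = "TRACKING"
    DEGRADED = "DEGRADED"
    LOST = "LOST"


@dataclass(frozen=True)
class TrackerSnapshot:
    """Hypothetical view of the backend's tracker state for one frame."""
    keyframes: int
    tracked_features: int
    cov_frobenius: float
    consecutive_lost: int
    bias_norm: float


def classify(
    snap: TrackerSnapshot,
    *,
    min_keyframes: int,            # assumption: the "enough keyframes" cut-off
    degraded_feature_threshold: int,
    steady_state_cov: float,       # assumption: reference steady-state norm
    lost_frame_threshold: int,
) -> VioState:
    """Apply the four state rules; the precedence order is an assumption."""
    if snap.consecutive_lost >= lost_frame_threshold:
        return VioState.LOST
    if snap.keyframes < min_keyframes:
        return VioState.INIT
    if (snap.tracked_features < degraded_feature_threshold
            or snap.cov_frobenius > 2.0 * steady_state_cov):
        return VioState.DEGRADED
    return VioState.TRACKING


def health_payload(state: VioState, snap: TrackerSnapshot, frame_id: str) -> dict:
    """Build the five-key payload listed for kind="vio.health"."""
    return {
        "state": state.value,
        "consecutive_lost": snap.consecutive_lost,
        "bias_norm": snap.bias_norm,
        "strategy_label": "okvis2",
        "frame_id": frame_id,
    }
```

Note that nothing in `classify` touches the covariance itself; it only reads it, which is in line with the honest-covariance invariant (no smoothing or floor applied by the strategy).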
## Scope
@@ -74,6 +74,19 @@ This task delivers the canonical production VIO. The other two strategies (VINS-
- OKVIS2 upstream-source modifications — upstream HEAD is pinned per Plan-phase; deviations require an explicit ADR.
- Multi-camera OKVIS2 — out of scope (single nav-camera per RESTRICT-UAV-3).
## Implementation Notes (2026-05-12, batch 23)
Carry-over plan (`_docs/03_implementation/AZ-332_implementation_plan.md`) splits AZ-332 into:
1. **This batch** — production-quality Python facade (`okvis2.py`), `Okvis2Config` schema extension, FDR `vio.health` kind, full AC-1..8 + AC-10 coverage against a `FakeOkvis2Backend` fixture (`tests/unit/c1_vio/conftest.py`), pybind11 binding source that compiles + loads but throws `OkvisFatalException("estimator not yet wired")` on first `add_frame` (loud-fail, never silent), CMake glue at `cpp/okvis2/CMakeLists.txt` (gated by `BUILD_OKVIS2`).
2. **Tier-2 follow-up** — actual `okvis::ThreadedKFVio` wiring inside the binding, CI matrix that installs Ceres + initialises OKVIS2's vendored submodules, AC-9 + NFR-perf validation on Jetson against Derkachi-class fixtures. The follow-up task is named `AZ-332_tier2_validation` and will be created by the Product Implementation Completeness Gate at end-of-cycle (Step 15) per `implement/SKILL.md`. Until that lands, GitHub Actions Linux CI builds with `-DBUILD_OKVIS2=OFF` (see `.github/workflows/ci.yml` comment).
Constructor signature contradiction (task-spec vs AZ-331 factory) resolved 2026-05-12 in favour of the factory: `__init__(self, config: Config, *, fdr_client: FdrClient, clock: Clock | None = None)`. Calibration / preintegrator / logger are built internally from `config`. No churn on AZ-331's already-tested factory.
IMU-substrate contradiction (task-spec "MUST consume IMU via AZ-276 ImuPreintegrator" vs OKVIS2's internal IMU integration owned by `okvis::ThreadedKFVio`) resolved 2026-05-12: OKVIS2 owns its own IMU integration for the VIO estimator's keyframe factor; the AZ-276 preintegrator is a *separate* substrate consumed by E-C5's fusion graph. The "single source of IMU truth" invariant operates at the *sample-stream* level (one IMU producer), not at the integrator-instance level.
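One way to picture the sample-stream invariant is a single producer that fans each IMU sample out to independent consumers. The sketch below is illustrative only: `ImuSample`, `ImuFanOut`, and the two list-backed sinks are hypothetical stand-ins for the OKVIS2 backend's `add_imu` and the AZ-276 preintegrator, and the monotonicity check here is a Python-side simplification (the spec places the real check on the C++ side).

```python
"""Sketch: one IMU sample stream feeding two independent integrators."""
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ImuSample:
    ns_ts: int
    accel: tuple[float, float, float]
    gyro: tuple[float, float, float]


class ImuFanOut:
    """Hypothetical single producer that every integrator subscribes to."""

    def __init__(self) -> None:
        self._sinks: list[Callable[[ImuSample], None]] = []
        self._last_ns: int | None = None

    def subscribe(self, sink: Callable[[ImuSample], None]) -> None:
        self._sinks.append(sink)

    def publish(self, samples: Iterable[ImuSample]) -> None:
        """Deliver each sample to every sink, in order, exactly once."""
        for sample in samples:
            if self._last_ns is not None and sample.ns_ts <= self._last_ns:
                raise ValueError("IMU timestamps must be strictly increasing")
            self._last_ns = sample.ns_ts
            for sink in self._sinks:
                sink(sample)


def demo() -> tuple[list[ImuSample], list[ImuSample]]:
    """Both consumers see the same stream but keep separate state."""
    okvis_backend_input: list[ImuSample] = []   # stand-in for add_imu
    preintegrator_input: list[ImuSample] = []   # stand-in for AZ-276
    fan_out = ImuFanOut()
    fan_out.subscribe(okvis_backend_input.append)
    fan_out.subscribe(preintegrator_input.append)
    fan_out.publish(
        ImuSample(ns_ts=t, accel=(0.0, 0.0, 9.81), gyro=(0.0, 0.0, 0.0))
        for t in (1_000, 2_000, 3_000)
    )
    return okvis_backend_input, preintegrator_input
```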
FDR API surface (`FdrClient.emit` in original prose) resolved to the actual public method `FdrClient.enqueue(record)`.
## Acceptance Criteria
**AC-1: `current_strategy_label()` returns `"okvis2"`**
@@ -0,0 +1,168 @@
# AZ-332 — Implementation plan (batch 23, cycle 1)
**Date created**: 2026-05-12 (carry-over from `/autodev` session 2026-05-12 morning)
**Owner**: next `/autodev` invocation starting from Step 7 Implement sub_step `compute-next-batch`
**Scope of this doc**: a concrete, in-order playbook for the next session. Reading this + the task spec at `_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md` is sufficient to resume — no other re-discovery needed.
---
## Why this is its own plan doc
AZ-332 (C1 OKVIS2 production-default VIO) is the first task in this project to require a native C++ build chain (OKVIS2 + pybind11). The previous session researched paths, surfaced blockers, and landed on a decomposition that splits work across three build environments. That decomposition has to survive the session boundary, hence this file.
## Decisions locked in the previous session
1. **No ROS 2 layer.** `colcon` build of OKVIS2 produces the same libraries as standalone CMake plus a ROS 2 node we do not need; ROS 2 runtime IPC was rejected at Plan time (`_docs/01_solution/solution.md` § D-C1-1-SUB-A — "Rejected (cost + latency budget conflict)"). Build with **standalone CMake**.
2. **No Python re-implementation of OKVIS2.** Forbidden by the task spec ("Unacceptable substitutes" section). Pure-Python VIO violates C1-PT-01 ≤ 80 ms p95 budget by construction.
3. **No alternative VIO substitution.** Every C++ VIO candidate (OpenVINS, VINS-Mono, Kimera-VIO) has the same compile-on-macOS problem. The only Python-native candidates (DPVO, KLT+RANSAC) are mono-VO only — not drop-ins for a VIO contract. AZ-332 stays OKVIS2.
4. **Three-environment dev split**:
| Environment | What runs there | What it gates |
|---|---|---|
| macOS dev | Python facade + binding C++ editing; unit tests using the fake `_native.okvis2_binding` (task spec explicitly allows this for tests) | AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8, AC-10 |
| Ubuntu CI runner (`ci.yml`) | Native CMake build of vendored OKVIS2 + binding `.so` | Build-passes gate; no AC validation here |
| Self-hosted Jetson runner (`ci-tier2.yml`) | Real-OKVIS2 perf + honest-covariance tests | AC-9 (honest covariance monotonicity); NFR-perf p95 ≤ 80 ms |
This split honours the task spec ("real `Okvis2Strategy` calling real C7 `InferenceRuntime` with real TRT-compiled DISK engine") because the production binary IS the real binding compiled on Linux/Jetson — only the dev-side unit tests use the fake. The fake never ships to production.
## Concrete step-by-step for next session (in order; each step has a stop-and-verify gate)
### Step 0 — re-entry sanity check (1 min)
- Read `_docs/_autodev_state.md`: confirm step 7 / sub_step `compute-next-batch` / detail points here.
- Read this doc fully.
- Read `_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md` once.
- `git status --porcelain` must be empty (implement skill prerequisite).
### Step 1 — vendor OKVIS2 and pybind11 as git submodules (5–10 min)
- `git submodule add --depth 1 --recurse-submodules https://github.com/smartroboticslab/okvis2.git cpp/okvis2/upstream`
- Note: submodule path is `cpp/okvis2/upstream/` (not `cpp/okvis2/` directly) so the existing `cpp/okvis2/CMakeLists.txt` keeps its project-owned role and `add_subdirectory(upstream)` pulls in OKVIS2.
- `git submodule add --depth 1 https://github.com/pybind/pybind11.git cpp/pybind11/upstream`
- Same pattern: existing `cpp/pybind11/` directory keeps the project README; submodule lives at `cpp/pybind11/upstream/`.
- Delete the `.gitkeep` and placeholder `README.md` from `cpp/pybind11/` once the submodule is in place (or keep them; they're harmless either way — pick one and stay consistent).
- Pin a known-good commit hash for OKVIS2 (record it in this doc under "Pinned upstream versions" once chosen). Recommendation: pin to the latest `main` HEAD at the time of submodule add and document the commit short-hash here.
- **Gate**: `git submodule status` shows both submodules with a SHA; `git status` clean except `.gitmodules` + submodule entries.
### Step 2 — write CMake glue (15–30 min)
Files to write:
- `cpp/okvis2/CMakeLists.txt` (replace existing placeholder):
- `if(NOT BUILD_OKVIS2) return() endif()`
- `add_subdirectory(upstream EXCLUDE_FROM_ALL)` with OKVIS2's `USE_NN=OFF` to drop the LibTorch dep (per Fact #39 — keyframe arch tolerates this).
- Call `find_package` for the Linux deps OKVIS2 needs (Eigen3, Boost, glog, gflags, SuiteSparse, Ceres, OpenCV; each is an apt package on Ubuntu and a brew formula on macOS).
- `add_subdirectory(${CMAKE_SOURCE_DIR}/cpp/pybind11/upstream pybind11_build)`.
- `pybind11_add_module(okvis2_binding ${CMAKE_CURRENT_SOURCE_DIR}/../../src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp)` — note path back to Python tree.
- `target_link_libraries(okvis2_binding PRIVATE okvis::Estimator okvis::Common ...)` (exact target names from OKVIS2's CMake exports — verify by running `cmake --build build --target help | grep okvis` once submodule is in).
- `install(TARGETS okvis2_binding DESTINATION ${CMAKE_INSTALL_LIBDIR}/gps_denied_onboard/components/c1_vio/_native/)`.
- `cpp/pybind11/CMakeLists.txt` (replace existing placeholder): can stay nearly empty — pybind11 is included by `cpp/okvis2/CMakeLists.txt` via `add_subdirectory`.
The existing top-level `cpp/CMakeLists.txt` already has `add_subdirectory(okvis2)` gated on `BUILD_OKVIS2 OR BUILD_VINS_MONO OR BUILD_KLT_RANSAC` — no change needed there.
**Gate**: `cmake -S . -B build -DBUILD_OKVIS2=OFF` succeeds on macOS (no-op build with the flag off). The OFF path is what protects the rest of the build from any of this new wiring.
### Step 3 — write the pybind11 binding C++ skeleton (1–2 h)
File: `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp`
Surface needed (mirrors the Python facade's needs — not the full OKVIS2 API):
- `Okvis2Backend` class with: ctor from YAML config string + camera intrinsics dict; `add_frame(frame_id: str, ns_ts: int, image: ndarray[uint8, H, W, C]) -> bool`; `add_imu(ns_ts: int, accel: ndarray[float64, 3], gyro: ndarray[float64, 3]) -> None`; `get_latest_output() -> dict | None` (returns frame_id + 4x4 pose matrix + 6x6 covariance + bias + feature_quality dict + emitted_at_ns); `reset(body_T_world: ndarray[float64, 4, 4], velocity: ndarray[float64, 3], accel_bias: ndarray[float64, 3], gyro_bias: ndarray[float64, 3]) -> None`; `health() -> dict` (returns `{state: str, consecutive_lost: int, bias_norm: float}`).
- Exceptions: every OKVIS / Eigen / std::runtime_error caught inside binding methods and rethrown as a fixed set of Python exceptions registered via `py::register_exception` — the Python facade then catches those and rewraps into `VioError` family.
- Zero-copy pathway: `image` is `py::array_t<uint8_t, py::array::c_style | py::array::forcecast>` so DISK ingest avoids a copy.
This is a skeleton — full OKVIS2 estimator wiring (`okvis::ThreadedKFVio` setup + callback plumbing) can be a follow-up commit if the skeleton + CI Linux build come back green first.
**Gate**: compiles inside the OKVIS2 CMake target. Tested on Ubuntu CI runner (not macOS).
### Step 4 — write the Python facade `okvis2.py` (1–2 h)
File: `src/gps_denied_onboard/components/c1_vio/okvis2.py`
- `Okvis2Strategy` class implementing the `VioStrategy` Protocol from `interface.py`.
- Lazy import of `_native.okvis2_binding` inside a function or method body (NOT at module top level; that's the I-5 / Risk-2 mitigation, and AZ-331's `test_ac5_build_vio_strategy_flag_off_no_import` asserts this and MUST still pass).
- Constructor signature: `__init__(self, config: Config, *, fdr_client: FdrClient)` — match the AZ-331 factory's call shape exactly. Inside the constructor: build the `ImuPreintegrator` from `helpers.imu_preintegrator.make_imu_preintegrator(calibration)`; build the `Okvis2Backend` from the binding; record the strategy label as `"okvis2"` (frozen per Protocol invariant).
- Map every backend exception (raised from the C++ binding's registered exception types) to the `VioError` family — `OkvisInitException → VioInitializingError`, `OkvisFatalException → VioFatalError`, `OkvisOptimizationException → VioDegradedError` (only when transitioning to fatal — the normal degraded path returns a `VioOutput` with inflated covariance per AZ-331 v1.0.0).
- `process_frame`: feed IMU samples to the preintegrator, push frame to backend, read latest output, build the `VioOutput` DTO using `gtsam.Pose3.matrix()` round-trip via `helpers.se3_utils` (AZ-277). Echo `frame_id`.
- `reset_to_warm_start`: tear down + reconstruct `Okvis2Backend` from the hint; first call must not raise (idempotency invariant per AC-4); seed bias into the preintegrator via `preintegrator.reset_with_bias(hint.bias)`.
- `health_snapshot`: pull `backend.health()` dict and wrap as `VioHealth`. Track `consecutive_lost` Python-side because the binding returns "current state" only.
- `current_strategy_label`: return the frozen `"okvis2"`.
- FDR records on state transitions via the injected `fdr_client` using the `kind="vio.health"` schema (AZ-272).
**Gate**: `mypy --strict` passes against the new file; `ruff check` passes; isinstance check `isinstance(Okvis2Strategy(...), VioStrategy)` returns True without importing the native binding (i.e., the Protocol's structural conformance, not the construction itself).
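The lazy import and the exception rewrapping from this step can be sketched together. This is an assumption-laden illustration rather than the shipped facade: `load_binding` and `call_backend` are hypothetical helper names, the `VioError` classes are local stand-ins for the real family, and only two of the three mappings listed above are shown (the `OkvisOptimizationException` case is omitted).

```python
"""Sketch: lazy native import plus a closed error envelope."""
import importlib

BINDING_MODULE = "gps_denied_onboard.components.c1_vio._native.okvis2_binding"


class VioError(Exception):
    """Stand-in root of the VioError family."""


class VioInitializingError(VioError):
    """Backend is still initialising."""


class VioFatalError(VioError):
    """Backend is unusable (missing binding, init failure, sustained LOST)."""


def load_binding():
    """Import the native module on first use, never at module import time."""
    try:
        return importlib.import_module(BINDING_MODULE)
    except ImportError as exc:
        raise VioFatalError(f"native binding unavailable: {exc}") from exc


def call_backend(binding, fn, *args):
    """Run one backend call and rewrap anything it raises into VioError."""
    try:
        return fn(*args)
    except binding.OkvisInitException as exc:
        raise VioInitializingError(str(exc)) from exc
    except Exception as exc:  # OkvisFatalException and any stray exception
        raise VioFatalError(str(exc)) from exc
```

Because `load_binding` is only called from inside the strategy, importing the facade module itself never touches the `.so`, which is the property the flag-off test depends on.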
### Step 5 — write `Okvis2Config` (15 min)
File: `src/gps_denied_onboard/components/c1_vio/config.py` (extend existing — do not duplicate `C1VioConfig`).
- Add `@dataclass(frozen=True) class Okvis2Config` with fields: `keyframe_window_size: int = 15` (∈ [10, 20] per D-C5-3); `keyframe_parallax_threshold_px: float = 3.0`; `ransac_inlier_ratio: float = 0.5`; `max_optimization_iters: int = 4`; `degraded_feature_threshold: int = 30`; `per_frame_debug_log: bool = False`.
- `__post_init__` validates ranges and raises `ConfigError`.
- Register the block under `config.components['c1_vio'].okvis2` (sub-block) — keep `C1VioConfig` as-is at the top level.
**Gate**: `Okvis2Config(keyframe_window_size=9)` raises `ConfigError`; `Okvis2Config()` defaults pass.
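A minimal sketch of the config block described in this step, using the field names and defaults listed above. `ConfigError` is a local stand-in because the real import path is not shown here, the `is_valid` helper is hypothetical, and every bound other than the [10, 20] window range is an assumption.

```python
"""Sketch: frozen Okvis2Config with range validation in __post_init__."""
from dataclasses import dataclass


class ConfigError(ValueError):
    """Stand-in for the project's ConfigError."""


@dataclass(frozen=True)
class Okvis2Config:
    keyframe_window_size: int = 15
    keyframe_parallax_threshold_px: float = 3.0
    ransac_inlier_ratio: float = 0.5
    max_optimization_iters: int = 4
    degraded_feature_threshold: int = 30
    per_frame_debug_log: bool = False

    def __post_init__(self) -> None:
        # Documented range: sliding-window size K must lie in [10, 20].
        if not 10 <= self.keyframe_window_size <= 20:
            raise ConfigError("keyframe_window_size must be in [10, 20]")
        # The bounds below are assumptions, not taken from the plan.
        if self.keyframe_parallax_threshold_px <= 0.0:
            raise ConfigError("keyframe_parallax_threshold_px must be > 0")
        if not 0.0 < self.ransac_inlier_ratio <= 1.0:
            raise ConfigError("ransac_inlier_ratio must be in (0, 1]")
        if self.max_optimization_iters < 1:
            raise ConfigError("max_optimization_iters must be >= 1")
        if self.degraded_feature_threshold < 0:
            raise ConfigError("degraded_feature_threshold must be >= 0")


def is_valid(**overrides) -> bool:
    """Return True when the overrides produce a config that validates."""
    try:
        Okvis2Config(**overrides)
    except ConfigError:
        return False
    return True
```

Validation in `__post_init__` works with `frozen=True` because it only reads fields; it never assigns to them.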
### Step 6 — write unit tests with fake binding (1–2 h)
Files:
- `tests/unit/c1_vio/conftest.py`: a `fake_okvis2_binding` fixture that installs a `types.ModuleType` at `sys.modules['gps_denied_onboard.components.c1_vio._native.okvis2_binding']` with a scriptable `Okvis2Backend` test double. The test double exposes a `script()` method that pre-loads a queue of outputs / exceptions; `add_frame` pops from the queue. This is the "fake pybind11 binding that returns scripted `VioOutput` payloads" the task spec explicitly allows.
- `tests/unit/c1_vio/test_okvis2_strategy.py`: one test per AC (AC-1 through AC-8, AC-10). Use the fake binding fixture. AC-9 and the NFR-perf test are written here too but marked `@pytest.mark.tier2` so `pytest -m "not tier2"` (the macOS dev loop) skips them; `ci-tier2.yml` picks them up.
**Gate**: every unit test passes on macOS with `pytest -m "not tier2" tests/unit/c1_vio/`. Full sweep (`pytest tests/`) shows the existing 1093 passing + the new tests, with the tier2-marked ones skipped on macOS.
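The fake-binding approach from this step might look roughly like the sketch below. It is a simplified stand-in, not the actual `conftest.py`: `install_fake_binding` is a hypothetical helper (a real fixture would more likely use pytest's `monkeypatch.setitem` so the entry is removed after each test), and returning `False` from `add_frame` when nothing is scripted is an assumption.

```python
"""Sketch: scriptable fake binding installed in sys.modules."""
import sys
import types
from collections import deque

BINDING_MODULE = "gps_denied_onboard.components.c1_vio._native.okvis2_binding"


class FakeOkvis2Backend:
    """Test double that replays a scripted queue of outputs or exceptions."""

    def __init__(self) -> None:
        self._queue = deque()
        self._latest = None

    def script(self, *items) -> None:
        """Queue output dicts or exception instances for later add_frame calls."""
        self._queue.extend(items)

    def add_frame(self, frame_id: str, ns_ts: int, image) -> bool:
        if not self._queue:
            return False  # assumption: nothing scripted means no new update
        item = self._queue.popleft()
        if isinstance(item, BaseException):
            raise item
        self._latest = item
        return True

    def get_latest_output(self):
        return self._latest


def install_fake_binding() -> types.ModuleType:
    """Register the fake under the binding's dotted name before lazy import."""
    module = types.ModuleType(BINDING_MODULE)
    module.Okvis2Backend = FakeOkvis2Backend
    sys.modules[BINDING_MODULE] = module
    return module
```

Since Python's import machinery consults `sys.modules` first, a later `importlib.import_module(BINDING_MODULE)` returns the fake without the parent packages or the compiled `.so` being present.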
### Step 7 — update `.github/workflows/ci.yml` to install OKVIS2's Linux deps (5–10 min)
- In the `build` matrix's `deployment` and `research` kinds, add a step BEFORE `cmake -S . -B build`:
```yaml
- name: Install OKVIS2 native deps
  run: |
    sudo apt-get update
    sudo apt-get install -y --no-install-recommends \
      libeigen3-dev libboost-all-dev libgoogle-glog-dev libgflags-dev \
      libsuitesparse-dev libceres-dev libopencv-dev
```
- Toggle `BUILD_OKVIS2` to `ON` in the `deployment` kind's `cmake_flags` (default config in `solution.md` says OKVIS2 is the production-default; the deployment matrix kind should enforce this).
- The `research` kind already has `BUILD_VINS_MONO=ON`; leave `BUILD_OKVIS2=ON` there too.
**Gate**: push branch; GitHub Actions Ubuntu runner completes the `cmake --build build --parallel` step. If OKVIS2's CMake export targets have a different name than `okvis::Estimator` / `okvis::Common`, the failure surfaces here and Step 2's `target_link_libraries` is patched. This is the only build-system feedback loop we get pre-Jetson — exploit it.
### Step 8 — AC coverage verification + code review (15–30 min)
- Verify every AC of AZ-332 maps to at least one test (skipped-with-reason counts as covered per implement skill Step 8).
- Invoke `/code-review` skill on the batch's changed files. Expected verdict: PASS or PASS_WITH_WARNINGS. Auto-fix or escalate per implement skill Step 10.
### Step 9 — commit (5 min)
- One commit per implement skill Step 11: `[AZ-332] C1 Okvis2Strategy: pybind11 binding skeleton + Python facade + fake-backend tests`.
- Body of commit message documents the three-environment split (macOS dev / Ubuntu CI / Jetson tier2) and notes that AC-9 + NFR-perf are tier2-gated.
### Step 10 — tracker + archive + batch report (5 min)
- Jira: AZ-332 In Progress → In Testing.
- Move `_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md` → `_docs/02_tasks/done/`.
- Write `_docs/03_implementation/batch_23_cycle1_report.md` with the standard report shape. Include the tier2-deferred AC-9 + NFR-perf items under "Deferred to tier2 CI".
- Update `_docs/_autodev_state.md`: sub_step → next batch detection.
## Files to be created / modified (summary)
Created:
- `cpp/okvis2/upstream/` (git submodule)
- `cpp/pybind11/upstream/` (git submodule)
- `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp`
- `src/gps_denied_onboard/components/c1_vio/okvis2.py`
- `tests/unit/c1_vio/conftest.py`
- `tests/unit/c1_vio/test_okvis2_strategy.py`
- `_docs/03_implementation/batch_23_cycle1_report.md`
Modified:
- `cpp/okvis2/CMakeLists.txt` (replace placeholder)
- `cpp/pybind11/CMakeLists.txt` (replace placeholder; can stay minimal)
- `src/gps_denied_onboard/components/c1_vio/config.py` (add `Okvis2Config`)
- `.github/workflows/ci.yml` (add apt-get step; flip `BUILD_OKVIS2=ON` in deployment kind)
- `.gitmodules` (auto-edited by submodule add)
- `_docs/_autodev_state.md`
- `_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md` (moved to done/)
## Tier2 deliverables (NOT this session — explicit follow-up)
AC-9 (honest covariance monotonicity) and the NFR-perf test (`process_frame` p95 ≤ 80 ms on Tier-2) require real OKVIS2 + Derkachi-class fixture footage on the actual Jetson hardware. They are:
- Written in `test_okvis2_strategy.py` marked `@pytest.mark.tier2`.
- Skipped on macOS dev + GitHub Actions Linux runner.
- Picked up by `ci-tier2.yml` on push to `stage` or `main`.
- A remediation task (`AZ-332_tier2_validation`) is OPTIONAL — could be tracked separately or rolled into the deferred Jetson MVE phase that D-C1-2 already scheduled. Pick at session-start time.
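The p95 ≤ 80 ms check at the heart of the NFR-perf test can be sketched with the standard library alone. The helper names (`measure_ms`, `p95_ms`, `within_budget`) and the nearest-rank percentile definition are assumptions; the real tier2 test would additionally carry the `@pytest.mark.tier2` marker and run `process_frame` on Jetson hardware.

```python
"""Sketch: nearest-rank p95 latency check against an 80 ms budget."""
import time
from typing import Callable, Sequence


def measure_ms(fn: Callable[[], object], iterations: int) -> list[float]:
    """Time `fn` repeatedly and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        latencies.append((time.perf_counter_ns() - start) / 1e6)
    return latencies


def p95_ms(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile (integer arithmetic for the rank)."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without floats
    return ordered[rank - 1]


def within_budget(samples_ms: Sequence[float], budget_ms: float = 80.0) -> bool:
    """True when the p95 latency does not exceed the budget."""
    return p95_ms(samples_ms) <= budget_ms
```

A p95 gate tolerates a small number of slow outliers (up to 5% of frames) while still failing if slowness becomes systematic.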
## Pinned upstream versions
Fill in once Step 1 is executed:
- `cpp/okvis2/upstream` — commit hash: _TBD_; OKVIS2 main branch HEAD at `<date>`
- `cpp/pybind11/upstream` — commit hash: _TBD_; pybind11 stable release tag `<version>`
## When this doc can be deleted
After AZ-332 lands and the next batch is in flight, this file is historical context. Move to `_docs/_archive/` (or delete if `_archive` doesn't exist) once Jetson tier2 CI has been green at least once on a real OKVIS2 run.
@@ -0,0 +1,99 @@
# Batch 23 — Cycle 1 — Implementation Report
**Batch**: 23/cycle1
**Date**: 2026-05-12
**Context**: Product implementation (greenfield Step 7)
**Tasks**: `AZ-332` (C1 OKVIS2 Strategy — Production-Default VIO)
## Task Outcomes
### AZ-332 — C1 OKVIS2 Strategy
**Status**: Implemented (Python facade + binding skeleton); see *Known Gaps* below — Step 15 Product Implementation Completeness Gate is expected to flag this for a tier-2 follow-up before the cycle-end report can be written.
**Files added**:
- `src/gps_denied_onboard/components/c1_vio/okvis2.py`: `Okvis2Strategy` Python facade conforming to AZ-331's `VioStrategy` Protocol (production-quality state machine, error envelope, FDR emission, Clock injection per Invariant 2).
- `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp` — pybind11 binding source: compiles + loads, throws `OkvisFatalException("estimator not yet wired")` on first `add_frame` (loud-fail, never silent).
- `src/gps_denied_onboard/components/c1_vio/bench/{__init__.py, okvis2.py}` — C1-PT-01 microbench harness.
- `tests/unit/c1_vio/conftest.py` — scriptable `FakeOkvis2Backend` installed at `sys.modules['gps_denied_onboard.components.c1_vio._native.okvis2_binding']` before lazy import.
- `tests/unit/c1_vio/test_okvis2_strategy.py` — 17 tests covering AC-1..10 (with AC-9 + NFR-perf marked `@pytest.mark.tier2`).
**Files modified**:
- `src/gps_denied_onboard/components/c1_vio/config.py` — added `Okvis2Config` sub-block (`keyframe_window_size ∈ [10,20]`, parallax / RANSAC inlier / max-iters / degraded-feature-threshold / per-frame-debug-log).
- `src/gps_denied_onboard/components/c1_vio/__init__.py` — re-export `Okvis2Config`.
- `src/gps_denied_onboard/fdr_client/records.py` — added `vio.health` kind to `KNOWN_PAYLOAD_KEYS` (payload: `state`, `consecutive_lost`, `bias_norm`, `strategy_label`, `frame_id`).
- `cpp/okvis2/CMakeLists.txt` — real glue (gated by `BUILD_OKVIS2`); links `okvis_ceres / okvis_frontend / okvis_multisensor_processing / okvis_kinematics / okvis_cv / okvis_common / okvis_time / okvis_util`; uses system-installed Ceres / BRISK / DBoW2.
- `.github/workflows/ci.yml` — temporarily forces `-DBUILD_OKVIS2=OFF` in both `deployment` and `research` matrix entries; comment links the decision to the tier-2 follow-up.
- `tests/unit/c1_vio/test_protocol_conformance.py`: `test_ac5_flag_on_but_module_missing` is now parameterised. `vins_mono`/`klt_ransac` still expect `StrategyNotAvailableError` (modules not yet implemented); `okvis2` now expects `VioFatalError("native binding ...")` because the strategy module IS present but the C++ binding isn't.
- `tests/unit/test_az272_fdr_record_schema.py` — added `vio.health` payload fixture so the AC-1 roundtrip test covers the new kind.
- `_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md`: `Implementation Notes (2026-05-12, batch 23)` section added with the three resolved contradictions (constructor signature, IMU substrate ownership, FDR `enqueue` vs prose `emit`).
**Submodules added**: `cpp/pybind11/upstream` (vendored pybind11), `cpp/okvis2/upstream` (vendored OKVIS2). Recursive submodule init is intentionally deferred — CI builds with `BUILD_OKVIS2=OFF` and dev macOS does not need OKVIS2's internal submodules.
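
The fake-binding approach described for `tests/unit/c1_vio/conftest.py` can be illustrated with a minimal sketch. It is not the project's fixture: the module key `demo_pkg_native_okvis2_binding`, the `Backend` attribute, and the helper names are hypothetical, chosen only to show how a fake registered in `sys.modules` is picked up by a later lazy import.

```python
import importlib
import sys
import types

# Hypothetical, shortened module key for the sketch; the real fixture is
# described as using the full
# `gps_denied_onboard.components.c1_vio._native.okvis2_binding` key.
BINDING_NAME = "demo_pkg_native_okvis2_binding"


class FakeOkvis2Backend:
    """Scriptable stand-in: records frames instead of running an estimator."""

    def __init__(self) -> None:
        self.frames: list[int] = []

    def add_frame(self, frame_id: int) -> bool:
        self.frames.append(frame_id)
        return True


def install_fake_binding() -> types.ModuleType:
    """Register the fake so a later import never touches a compiled .so."""
    fake = types.ModuleType(BINDING_NAME)
    fake.Backend = FakeOkvis2Backend  # hypothetical attribute name
    sys.modules[BINDING_NAME] = fake
    return fake


def lazy_load_backend() -> FakeOkvis2Backend:
    """Facade-side lazy import: the binding is resolved on first use only."""
    module = importlib.import_module(BINDING_NAME)
    return module.Backend()


install_fake_binding()
backend = lazy_load_backend()
backend.add_frame(7)
```

Because `importlib.import_module` consults `sys.modules` first, the order matters: the fake has to be installed before the facade's first lazy import, which is why the report notes that the fixture runs "before lazy import".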
## AC Coverage Verification
| AC | Test | Status |
|---------|------|------|
| AC-1 | `test_ac1_current_strategy_label_returns_okvis2` | ✓ Covered |
| AC-2 | `test_ac2_process_frame_returns_vio_output_with_frame_id` | ✓ Covered |
| AC-3 | `test_ac3_backend_exceptions_rewrap_to_vio_error_family` (+ 2 siblings) | ✓ Covered |
| AC-4 | `test_ac4_reset_to_warm_start_clears_and_seeds` + `_is_idempotent` | ✓ Covered |
| AC-5 | `test_ac5_health_snapshot_init_then_tracking` | ✓ Covered |
| AC-6 | `test_ac6_degraded_on_feature_loss_emits_vio_output` | ✓ Covered |
| AC-7 | `test_ac7_sustained_loss_raises_vio_fatal_error` | ✓ Covered |
| AC-8 | `test_ac8_strategy_module_not_imported_at_package_load` (+ `test_ac5_build_vio_strategy_flag_off_no_import` in protocol_conformance.py) | ✓ Covered |
| AC-9 | `test_ac9_honest_covariance_monotonic_during_degraded` `@tier2` | ✓ Covered (tier2) |
| AC-10 | `test_ac10_fdr_vio_health_emitted_per_transition` | ✓ Covered |
| NFR-perf | `test_nfr_perf_process_frame_p95_under_80ms` `@tier2` | ✓ Covered (tier2) |
Plus 2 construction guards (`test_construct_with_wrong_strategy_label_raises`, `test_build_via_factory_returns_okvis2_strategy`) — 17 tests total. **All ACs covered.**
## Test Run
- **Targeted** (`pytest tests/unit/c1_vio/test_okvis2_strategy.py -m "not tier2"`): **15 passed**, 2 deselected (tier2).
- **Full Tier-1 suite** (`pytest -m "not tier2"`): **1109 passed**, 2 skipped (env: `cmake` / `actionlint` not on local PATH; CI installs both), 2 deselected (tier2). No regressions.
## Code Review
Self-review verdict: **PASS** (no critical / no high findings).
Notes from review:
- `Okvis2Strategy._classify_state` warm-start arithmetic verified by trace against `warm_start_max_frames` ∈ {1, 3, 5}; AC-5 default-5 produces TRACKING on the 5th successful call.
- `_emit_transition` is idempotent under repeated identical states — `_last_emitted_state` guard prevents steady-state FDR spam (AC-10 invariant).
- `_tick_lost` keeps state at `INIT` through opt-exception runs until `lost_frame_threshold` trips, matching AC-7 trace.
- Native binding catches every Eigen / `std::runtime_error` and rewraps into one of three registered Python-side exception types; the Python facade further rewraps into the `VioError` family with `__cause__` chains preserved (AC-3).
- `Clock` injection follows the c13_fdr/writer.py pattern (optional kwarg, defaults to `WallClock()`); composition-root replay binding will inject `TlogDerivedClock` separately. No direct `time.monotonic_ns` / `time.time_ns` / `time.sleep` calls in any new `components/` source.
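
The `_last_emitted_state` guard mentioned in the review notes can be sketched as follows. This is an illustration under assumptions, not the code in `okvis2.py`: the `TransitionEmitter` class, its list-based sink, and the record layout are hypothetical; only the `VioState` values and the guard's name come from the diff.

```python
from __future__ import annotations

from enum import Enum


class VioState(str, Enum):
    INIT = "init"
    TRACKING = "tracking"
    DEGRADED = "degraded"
    LOST = "lost"


class TransitionEmitter:
    """Hypothetical helper: emits a record only when the state changes."""

    def __init__(self, sink: list[dict]) -> None:
        self._sink = sink
        self._last_emitted_state: VioState | None = None

    def emit_transition(self, state: VioState, frame_id: str) -> bool:
        # Repeated identical states are dropped, so steady-state frames
        # produce no additional records.
        if state is self._last_emitted_state:
            return False
        self._sink.append(
            {"kind": "vio.health", "state": state.value, "frame_id": frame_id}
        )
        self._last_emitted_state = state
        return True


records: list[dict] = []
emitter = TransitionEmitter(records)
observed = [
    VioState.INIT,
    VioState.INIT,
    VioState.TRACKING,
    VioState.TRACKING,
    VioState.DEGRADED,
]
for index, state in enumerate(observed):
    emitter.emit_transition(state, frame_id=str(index))
```

Five observed states yield three records here (INIT, TRACKING, DEGRADED), one per transition.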
## Known Gaps (for Step 15 Product Implementation Completeness Gate)
The AZ-332 task spec promises a fully wired OKVIS2 estimator (real `okvis::ThreadedKFVio` callbacks producing pose + covariance for the C5 fusion graph). This batch ships:
- **PASS**: Python facade with full production state machine + error envelope + FDR emission.
- **FAIL**: C++ binding wires the API surface but throws `OkvisFatalException("estimator not yet wired")` on first `add_frame`. The actual `okvis::ThreadedKFVio` setup + callback plumbing + Hessian-block extraction is not implemented.
- **FAIL**: GitHub Actions Linux CI compiles with `BUILD_OKVIS2=OFF`; the OKVIS2 native build path is not exercised in any pipeline.
- **PASS (tier2)**: AC-9 (covariance Frobenius monotonicity under DEGRADED) + NFR-perf (p95 ≤ 80 ms on Jetson) — Tier-2 / Jetson-only; will run on real OKVIS2 once estimator wiring lands.
The Step 15 gate is expected to classify AZ-332 as **FAIL** and require a `remediate_AZ-332_tier2_validation` task that:
1. Wires `okvis::ThreadedKFVio` (or upstream-equivalent) inside `okvis2_binding.cpp`.
2. Adds Ceres / SuiteSparse / OpenCV apt-installs + recursive submodule checkout to the Linux CI build.
3. Sets `-DBUILD_OKVIS2=ON` in the Linux deployment matrix.
4. Validates AC-9 + NFR-perf on Tier-2 Jetson hardware against a Derkachi-class fixture.
This is **NOT** a hidden gap — it is recorded here, in the AZ-332 spec's *Implementation Notes* section, and in the CI yaml comment block.
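
For orientation, the AC-9 property named above (covariance Frobenius monotonicity under DEGRADED) could be checked along these lines. The sketch assumes that "monotonic" means the Frobenius norm of the 6×6 pose covariance never decreases across consecutive DEGRADED frames; the helper names and the tolerance are hypothetical, and the actual tier2 test is not shown in this report.

```python
import math


def frobenius_norm(matrix: list[list[float]]) -> float:
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(value * value for row in matrix for value in row))


def covariance_never_tightens(
    covariances: list[list[list[float]]], rel_tol: float = 1e-9
) -> bool:
    """True if each covariance norm is >= the previous one, within rel_tol."""
    norms = [frobenius_norm(cov) for cov in covariances]
    return all(
        later >= earlier * (1.0 - rel_tol)
        for earlier, later in zip(norms, norms[1:])
    )


def scaled_identity(scale: float, size: int = 6) -> list[list[float]]:
    """Toy covariance: scale * I, used only to exercise the check."""
    return [
        [scale if row == col else 0.0 for col in range(size)]
        for row in range(size)
    ]


growing = [scaled_identity(1.0), scaled_identity(1.5), scaled_identity(2.0)]
tightening = [scaled_identity(1.0), scaled_identity(0.5)]

growing_ok = covariance_never_tightens(growing)
tightening_ok = covariance_never_tightens(tightening)
```

On the toy inputs, the growing sequence passes the check and the tightening one fails it. A real validation would feed covariances captured from the estimator on the Jetson fixture.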
## Cumulative Review Trigger
Last cumulative review covered batches 01-22. K = 3 → next trigger fires at batch 25. **No cumulative review for this batch.**
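
The trigger arithmetic can be written out as a small sketch. It assumes the rule implied by the numbers in this section, namely that the review fires K batches after the last batch covered; the function names are hypothetical.

```python
def next_cumulative_review_batch(last_reviewed_batch: int, k: int) -> int:
    """Batch number at which the next cumulative review fires."""
    if k < 1:
        raise ValueError("k must be a positive batch interval")
    return last_reviewed_batch + k


def review_due(current_batch: int, last_reviewed_batch: int, k: int) -> bool:
    """True once the current batch reaches the next trigger."""
    return current_batch >= next_cumulative_review_batch(last_reviewed_batch, k)


next_trigger = next_cumulative_review_batch(last_reviewed_batch=22, k=3)
due_for_batch_23 = review_due(current_batch=23, last_reviewed_batch=22, k=3)
```

With the last review at batch 22 and K = 3, the next trigger is batch 25, so batch 23 is not due.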
## Auto-Fix Attempts / Escalations
- **Auto-fixes**: 16 ruff lint findings auto-fixed (unused imports, B905 zip strict, RUF007 itertools.pairwise, RUF022 __all__ sorting, I001 import order). Format applied via `ruff format` (7 files reformatted).
- **Escalations**: none.
## Open Blockers
- None for this batch. The tier-2 wiring task is a deferred follow-up, not a blocker on this batch's commit.
+3 -3
@@ -6,9 +6,9 @@ step: 7
name: Implement
status: in_progress
sub_step:
phase: 14
name: cumulative-code-review
detail: "PASS after F1+F2 remediation in-session; F3 informational; ready for batch 23"
phase: 13
name: archive-and-loop
detail: "batch 23/cycle1 complete: AZ-332 → In Testing, archived to done/. Next: recompute batch 24 (AZ-345 still gated; product-tasks queue may be near-empty — Step 15 Product Implementation Completeness Gate is the expected next stop)."
retry_count: 0
cycle: 1
tracker: jira
+79 -4
@@ -1,9 +1,84 @@
# OKVIS2 native wrapper — placeholder.
# cpp/okvis2/CMakeLists.txt — OKVIS2 wrapper for C1 VIO (AZ-332).
#
# Owned by C1 VIO (AZ-332). Bootstrap ships an empty subproject so CMake parses
# top-level when BUILD_OKVIS2=ON.
# Builds the vendored OKVIS2 upstream (cpp/okvis2/upstream/, git submodule)
# plus a pybind11 binding that exposes the estimator to the Python facade
# at src/gps_denied_onboard/components/c1_vio/okvis2.py.
#
# Gating: BUILD_OKVIS2=ON only on linux production binaries (deployment +
# research matrix kinds in .github/workflows/ci.yml). macOS dev builds
# default BUILD_OKVIS2=OFF; unit tests use a fake pybind11 binding fixture
# installed at sys.modules boundary (tests/unit/c1_vio/conftest.py).
#
# Bundled OKVIS2 deps (DBoW2, brisk, ceres-solver, opengv) are NOT pulled
# into this clone — see ci.yml step that installs them via apt
# (libceres-dev libsuitesparse-dev etc.) and the USE_SYSTEM_* flags below.
if(NOT BUILD_OKVIS2)
return()
endif()
message(STATUS "[okvis2] Placeholder; concrete sources land with AZ-332.")
message(STATUS "[okvis2] BUILD_OKVIS2=ON — building OKVIS2 upstream + pybind11 binding")
# Tell OKVIS2 to use system-installed dependencies instead of its bundled
# external/ submodules (which we do not initialise — saves ~hundreds of MB
# and matches the Linux apt-deps approach in ci.yml).
set(USE_SYSTEM_BRISK ON CACHE BOOL "AZ-332: use apt libbrisk-dev" FORCE)
set(USE_SYSTEM_DBOW2 ON CACHE BOOL "AZ-332: use apt libdbow2-dev" FORCE)
set(USE_SYSTEM_CERES ON CACHE BOOL "AZ-332: use apt libceres-dev" FORCE)
# Trim OKVIS2's build surface — we link the estimator libs only.
set(BUILD_APPS OFF CACHE BOOL "AZ-332: skip OKVIS2 demo apps" FORCE)
set(BUILD_TESTS OFF CACHE BOOL "AZ-332: skip OKVIS2 gtests" FORCE)
set(BUILD_ROS2 OFF CACHE BOOL "AZ-332: ROS 2 rejected at Plan time (D-C1-1-SUB-A)" FORCE)
set(HAVE_LIBREALSENSE OFF CACHE BOOL "AZ-332: no realsense pipeline" FORCE)
set(USE_NN OFF CACHE BOOL "AZ-332: drop LibTorch dep (keyframe arch OK per Fact #39)" FORCE)
set(DO_TIMING OFF CACHE BOOL "AZ-332: disable per-frame timing prints" FORCE)
set(BUILD_SHARED_LIBS OFF CACHE BOOL "AZ-332: link OKVIS as static into the .so" FORCE)
# pybind11 (vendored at cpp/pybind11/upstream/) — guarded so a sibling
# native binding (gtsam_bindings, faiss_index) cannot double-add the
# subdirectory.
if(NOT TARGET pybind11::module)
add_subdirectory(
${CMAKE_SOURCE_DIR}/cpp/pybind11/upstream
${CMAKE_BINARY_DIR}/pybind11_build
)
endif()
# Vendored OKVIS2 upstream — EXCLUDE_FROM_ALL keeps unused targets out of
# the default build graph; we depend on the okvis_* libs we explicitly
# link below.
add_subdirectory(upstream EXCLUDE_FROM_ALL)
# pybind11 binding source — per module-layout.md rule #4 the binding code
# lives next to the Python facade, not under cpp/.
set(OKVIS2_BINDING_SRC
${CMAKE_SOURCE_DIR}/src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp
)
pybind11_add_module(okvis2_binding ${OKVIS2_BINDING_SRC})
# OKVIS2 export targets — exact list confirmed by walking upstream
# CMakeLists in cpp/okvis2/upstream/okvis_*/. If a target name changes
# upstream, the linker error on first CI run pinpoints which one.
target_link_libraries(okvis2_binding
PRIVATE
okvis_ceres
okvis_frontend
okvis_multisensor_processing
okvis_kinematics
okvis_cv
okvis_common
okvis_time
okvis_util
)
target_compile_features(okvis2_binding PRIVATE cxx_std_17)
# Install the .so next to the Python facade so the lazy import inside
# okvis2.py (`from . import _native; _native.okvis2_binding`) resolves at
# runtime without a sys.path shim.
install(TARGETS okvis2_binding
LIBRARY DESTINATION
${CMAKE_INSTALL_LIBDIR}/gps_denied_onboard/components/c1_vio/_native/
)
Submodule cpp/okvis2/upstream added at a2ea00688c
+113
@@ -0,0 +1,113 @@
"""C3 cross-domain matcher DTOs (L1 cross-component layer; AZ-344).
Frozen by ``contracts/c3_matcher/cross_domain_matcher_protocol.md``
v1.0.0: three slotted, immutable dataclasses
(:class:`CandidateMatchSet`, :class:`MatchResult`,
:class:`MatcherHealth`).
:class:`CandidateMatchSet.tile_id` is a plain
``tuple[int, float, float]`` of ``(zoom_level, lat, lon)`` —
identical encoding to :class:`VprCandidate.tile_id` /
:class:`RerankCandidate.tile_id` — keeping the L1 layer free of an
L1→L3 import per ``module-layout.md`` (consumers reconstruct
:class:`gps_denied_onboard.components.c6_tile_cache.TileId` at the
C6 boundary).
:class:`MatchResult.per_candidate` is a ``tuple`` (not a ``list``)
so the ``frozen=True, slots=True`` invariant truly holds — a frozen
dataclass holding a mutable list lets consumers mutate it; the
tuple closes that door.
"""
from __future__ import annotations
from dataclasses import dataclass
import numpy as np
__all__ = ["CandidateMatchSet", "MatchResult", "MatcherHealth"]
@dataclass(frozen=True, slots=True)
class CandidateMatchSet:
"""Per-candidate matching outcome inside a :class:`MatchResult`.
``inlier_correspondences`` is shape ``(I, 4)`` ``float32`` with
columns ``(px_query, py_query, px_tile, py_tile)``; rows are
RANSAC inliers only so ``I == inlier_count``.
``per_candidate_residual_px`` is the MEDIAN reprojection
residual on inliers — not the mean, not the max. C3.5's
threshold gate compares against this value (INV-8).
"""
tile_id: tuple[int, float, float]
inlier_count: int
inlier_correspondences: np.ndarray
ransac_outlier_count: int
per_candidate_residual_px: float
@dataclass(frozen=True, slots=True)
class MatchResult:
"""Cross-domain match outcome for one frame.
Consumed by C3.5 :class:`ConditionalRefiner` (AZ-348) — which
may either pass the result through unchanged or emit a new
instance via :func:`dataclasses.replace` with the refinement
fields set.
The ``per_candidate`` tuple is sorted descending by
``inlier_count`` with ties broken ascending by
``per_candidate_residual_px`` (INV-3) so
``best_candidate_idx == 0`` by construction.
``reprojection_residual_px`` is the best candidate's median
residual (mirrors ``per_candidate[0].per_candidate_residual_px``)
surfaced separately so consumers do not have to know the
ranking encoding.
The two refinement fields are populated by C3.5; they default
to the passthrough values so a C3 producer that never goes
through C3.5 still yields a valid downstream-readable
:class:`MatchResult` (AZ-348 AC-2). ``refinement_label`` is
one of ``"adhop"`` or ``"passthrough"``;
``refinement_added_latency_ms`` covers exactly the work done
inside :meth:`ConditionalRefiner.refine_if_needed` and is
``0.0`` on pure passthrough.
"""
frame_id: int
per_candidate: tuple[CandidateMatchSet, ...]
best_candidate_idx: int
reprojection_residual_px: float
matched_at: int
matcher_label: str
candidates_input: int
candidates_dropped: int
refinement_label: str = "passthrough"
refinement_added_latency_ms: float = 0.0
@dataclass(frozen=True, slots=True)
class MatcherHealth:
"""Rolling-window matcher health snapshot.
Produced by :meth:`CrossDomainMatcher.health_snapshot`. Drives
C5's spoof-promotion gate (AC-NEW-2 / AC-NEW-7) and post-flight
forensics.
``consecutive_low_inlier`` is the count of CONSECUTIVE recent
frames whose best-candidate inlier count fell below the
configured ``min_inliers_threshold`` floor; it resets to zero on
any frame whose inlier count meets or exceeds the floor
(INV-12).
``mean_inliers_60s`` is the rolling 60 s mean of best-candidate
inlier counts. ``backbone_error_count_60s`` counts per-candidate
:class:`MatcherBackboneError` occurrences in the same window.
"""
consecutive_low_inlier: int
mean_inliers_60s: float
backbone_error_count_60s: int
+9 -14
@@ -1,25 +1,20 @@
"""C3 / shared cross-domain matching DTOs."""
"""Shared LightGlue-runtime DTOs (L1 helper layer; AZ-278).
The cross-component ``MatchResult`` DTO previously lived here under
a placeholder schema; AZ-344 froze the real shape in
:mod:`gps_denied_onboard._types.matcher`. The two surviving classes
(:class:`KeypointSet`, :class:`CorrespondenceSet`) are the L1
helper substrate for :class:`gps_denied_onboard.helpers.lightglue_runtime.LightGlueRuntime`
and stay here.
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import Any
import numpy as np
@dataclass(frozen=True)
class MatchResult:
"""Output of the cross-domain matcher (frame ↔ satellite tile)."""
query_frame_id: int
tile_id: str
keypoints_query: Any
keypoints_tile: Any
matches: Any
inlier_mask: Any | None = None
@dataclass(frozen=True)
class KeypointSet:
"""A backbone-extracted keypoint + descriptor bundle.
+111 -6
@@ -1,16 +1,25 @@
"""Navigation-side DTOs: camera frames, IMU samples, attitude, FC flight state, GPS health.
"""Navigation-side DTOs: camera frames, IMU samples, attitude, VIO output.
These are type-only stubs created by AZ-263 (Bootstrap). Concrete field semantics are
defined in `_docs/02_document/architecture.md § 4` and the C1 / C5 / C8 component
specs. Concrete subclasses are owned by the components that emit them; downstream
consumers depend on the DTOs declared here.
Type-only stubs created by AZ-263 (Bootstrap) and extended by AZ-331
(C1 VioStrategy contract freeze, v1.0.0). Concrete field semantics
are defined in ``_docs/02_document/architecture.md § 4`` and the C1
/ C5 / C8 component specs. Concrete subclasses are owned by the
components that emit them; downstream consumers depend on the DTOs
declared here. Cross-component DTOs (``VioOutput``, ``VioHealth``)
live at this L1 layer per module-layout.md — components MUST NOT
import them from other components.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
import numpy.typing as npt
from gps_denied_onboard.helpers.se3_utils import SE3
@dataclass(frozen=True)
@@ -75,3 +84,99 @@ class AttitudeWindow:
# canonical shape uses enums + monotonic_ns timestamps; the old stubs
# from AZ-263 used `str` + `datetime` and were never wired by any
# production producer or consumer).
# ----------------------------------------------------------------------
# AZ-331 — C1 VioStrategy contract DTOs (v1.0.0).
from enum import Enum
class VioState(str, Enum):
"""C1 VIO health state reported on every ``VioHealth`` snapshot."""
INIT = "init"
TRACKING = "tracking"
DEGRADED = "degraded"
LOST = "lost"
@dataclass(frozen=True)
class FeatureQuality:
"""Per-frame feature-tracking diagnostics (C1 contract v1.0.0).
Surfaced inside :class:`VioOutput`; consumed by C13 FDR + C5 fusion
for adaptive-gating decisions.
"""
tracked: int
new: int
lost: int
mean_parallax: float
mre_px: float
@dataclass(frozen=True)
class WarmStartPose:
"""Hint passed to ``VioStrategy.reset_to_warm_start`` after F8 reboot.
``body_T_world`` is a :class:`gtsam.Pose3` (= ``SE3``); the warm-start
+ F8 reboot recovery wiring task in E-C1 owns the on-disk persistence
pattern. ``captured_at_ns`` is :func:`time.monotonic_ns` at hint
production time.
"""
body_T_world: "SE3"
velocity_b: tuple[float, float, float]
bias: "ImuBias"
captured_at_ns: int
@dataclass(frozen=True)
class VioOutput:
"""C1 strategy per-frame output (C1 contract v1.0.0).
``frame_id`` is a stable string identifier echoed from the input
``NavCameraFrame.frame_id`` (stringified if the source is an int
or UUID); C5 uses it as a hashable key into its frame→GTSAM-key
map (Invariant — alignment).
``relative_pose_T`` is the strategy's current pose as a
:class:`gtsam.Pose3`; expressed in the strategy's own internal
frame (VIO has no absolute world reference — the internal frame
drifts from world over time). C5 computes the between-factor
delta via ``prev.between(curr)`` and inserts ``curr`` as the
iSAM2 Values entry.
``pose_covariance_6x6`` is the symmetric positive-definite 6×6
covariance in the rotation-then-translation tangent-space order;
strategies MUST NOT tighten this during a degradation event
(Invariant — honest covariance is the AC-NEW-4 / AC-NEW-7 floor).
``imu_bias`` is the strategy's latest bias estimate;
``feature_quality`` is per-frame tracker diagnostics;
``emitted_at_ns`` is :func:`time.monotonic_ns` at output time.
"""
frame_id: str
relative_pose_T: "SE3"
pose_covariance_6x6: "npt.NDArray[Any]"
imu_bias: "ImuBias"
feature_quality: FeatureQuality
emitted_at_ns: int
@dataclass(frozen=True)
class VioHealth:
"""C1 strategy health snapshot (C1 contract v1.0.0).
Returned from :meth:`VioStrategy.health_snapshot`; consumed by C13
FDR and the composition-root watchdog. ``consecutive_lost`` ticks
every time ``state == LOST``; the strategy raises
:class:`VioFatalError` once this exceeds
``config.vio.lost_frame_threshold`` (default 9).
"""
state: VioState
consecutive_lost: int
bias_norm: float
+67
@@ -0,0 +1,67 @@
"""C2.5 rerank DTOs (L1 cross-component layer; AZ-342).
The two-DTO surface is frozen by
``contracts/c2_5_rerank/rerank_strategy_protocol.md`` v1.0.0:
slotted, immutable, ``produced_at`` stamped with the producer's
``monotonic_ns`` so the C13 FDR record can correlate without a
wall-clock dependency.
:class:`RerankCandidate.tile_id` is a plain
``tuple[int, float, float]`` of ``(zoom_level, lat, lon)`` —
identical encoding to :class:`VprCandidate.tile_id` — keeping the L1
layer free of an L1→L3 import per ``module-layout.md`` (consumers
reconstruct :class:`gps_denied_onboard.components.c6_tile_cache.TileId`
at the C6 boundary).
:class:`RerankCandidate.tile_pixels_handle` is intentionally typed
``object``: C6 owns the actual handle type and the rerank Protocol
treats it as opaque per Invariant 6 (the handle is a reference, NOT
a copy — copying tile pixels would defeat AC-4.1's latency budget).
"""
from __future__ import annotations
from dataclasses import dataclass
__all__ = ["RerankCandidate", "RerankResult"]
@dataclass(frozen=True, slots=True)
class RerankCandidate:
"""One re-rank survivor.
Carries the C2-stage ``descriptor_distance`` + ``descriptor_dim``
forward unchanged (INV-5) so the FDR record retains the full
provenance chain. ``inlier_count`` is the new field produced by
the single-pair LightGlue forward at re-rank time; ``> 0`` for
every survivor.
"""
tile_id: tuple[int, float, float]
inlier_count: int
descriptor_distance: float
descriptor_dim: int
tile_pixels_handle: object
@dataclass(frozen=True, slots=True)
class RerankResult:
"""Top-N survivors from :meth:`ReRankStrategy.rerank`.
Consumed by C3 CrossDomainMatcher. ``candidates`` is a tuple
(not a list) so the frozen+slots invariant truly holds — a frozen
dataclass holding a mutable list lets consumers mutate it; the
tuple closes that door.
``candidates_input`` / ``candidates_dropped`` make the
drop-and-continue accounting (INV-8) observable per-frame so a
post-flight aggregate alert can flag flights whose
``candidates_dropped`` p95 climbs.
"""
frame_id: int
candidates: tuple[RerankCandidate, ...]
reranked_at: int
rerank_label: str
candidates_input: int
candidates_dropped: int
-21
@@ -1,21 +0,0 @@
"""C1 VIO output DTO."""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
@dataclass(frozen=True)
class VioOutput:
"""VIO pose + uncertainty + health bundle.
Concrete semantics in `_docs/02_document/components/01_c1_vio/description.md § 2`.
"""
frame_id: int
timestamp: datetime
pose_se3: Any
covariance_6x6: Any | None = None
health_flags: dict[str, Any] = field(default_factory=dict)
+73 -21
@@ -1,35 +1,87 @@
"""C2 VPR + C2.5 rerank DTOs."""
"""C2 VPR DTOs (L1 cross-component layer; AZ-336).
The trio (:class:`VprQuery`, :class:`VprCandidate`, :class:`VprResult`)
is frozen by ``contracts/c2_vpr/vpr_strategy_protocol.md`` v1.0.0:
slotted, immutable, no defaults, and stamped with the producer's
``monotonic_ns`` so the C13 FDR record can correlate the embed→retrieve
hop without a wall-clock dependency.
C2.5 rerank DTOs live in :mod:`gps_denied_onboard._types.rerank` (AZ-342);
this module no longer re-exports them.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
from dataclasses import dataclass
__all__ = [
"VprCandidate",
"VprQuery",
"VprResult",
]
@dataclass(frozen=True)
@dataclass(frozen=True, slots=True)
class VprQuery:
"""A VPR query (global descriptor + frame metadata)."""
"""Backbone embedding for one nav-camera frame.
Produced by :meth:`VprStrategy.embed_query`; consumed by
:meth:`VprStrategy.retrieve_topk` on the same instance (same
ingest thread per INV-1) or — in the offline C10 corpus-build
path — by the descriptor index builder.
``frame_id`` echoes :attr:`NavCameraFrame.frame_id` (int); the
contract documented ``UUID`` in draft but every other in-pipeline
DTO routed from a single frame carries the source ``int`` so
consumers can correlate without an extra mapping step.
``embedding`` is L2-normalised by the strategy before return
(INV-3); ``produced_at`` is ``monotonic_ns`` from an injected
:class:`gps_denied_onboard.clock.Clock`.
"""
frame_id: int
timestamp: datetime
global_descriptor: Any
metadata: dict[str, Any] = field(default_factory=dict)
embedding: object # numpy.ndarray, shape (D,), dtype float16|float32
produced_at: int
@dataclass(frozen=True)
@dataclass(frozen=True, slots=True)
class VprCandidate:
"""One retrieval candidate from a top-K result.
``tile_id`` is the composite ``(zoom_level, lat, lon)`` tuple
matching :class:`gps_denied_onboard.components.c6_tile_cache.TileId`;
the tuple form keeps this L1 DTO free of an L3 component import
(the contract spells it ``tuple`` per the architecture layering
rule). Consumers reconstruct ``TileId`` at the C6 boundary.
``descriptor_distance`` is the backbone-specific metric (cosine
on L2-normalised embeddings, Euclidean on raw); the strategy
sorts the parent ``VprResult.candidates`` tuple ascending by
this field (INV-4). ``descriptor_dim`` is echoed from the
strategy so a downstream FDR audit can verify the strategy↔index
dim match after the flight.
"""
tile_id: tuple[int, float, float]
descriptor_distance: float
descriptor_dim: int
@dataclass(frozen=True, slots=True)
class VprResult:
"""Top-K candidates from C2 retrieval."""
"""Top-K candidates from :meth:`VprStrategy.retrieve_topk`.
query_frame_id: int
candidate_tile_ids: tuple[str, ...]
scores: tuple[float, ...]
Consumed by C2.5 ``RerankStrategy``. ``candidates`` is a tuple
(not a list) so the frozen+slotted invariant holds: a frozen
dataclass can still receive a mutable list and let consumers
mutate it; the tuple closes that door.
``backbone_label`` is the strategy's lowercase
``BUILD_VPR_<variant>`` form (INV-5) — non-empty for every
successful retrieval; consumed by C13 FDR for provenance.
"""
@dataclass(frozen=True)
class RerankResult:
"""C2.5 reranked set of candidate tiles."""
query_frame_id: int
candidate_tile_ids: tuple[str, ...]
inlier_counts: tuple[int, ...]
frame_id: int
candidates: tuple[VprCandidate, ...]
retrieved_at: int
backbone_label: str
+8 -3
@@ -1,7 +1,12 @@
"""Clock interface + concrete implementations.
"""``Clock`` Protocol — public surface (AZ-398 v1.0.0).
The interface is bootstrap-stubbed here. `WallClock` (live) and `TlogDerivedClock`
(replay) are owned by AZ-401 (E-DEMO-REPLAY).
Re-exports the Protocol only; concrete strategies (``WallClock``,
``TlogDerivedClock``) are NOT exported via ``__all__`` per AC-9 —
composition-root code imports them from their concrete module paths
so the lazy-import boundary stays explicit.
Components import :class:`Clock` and accept it via constructor
injection (Invariant 2).
"""
from gps_denied_onboard.clock.interface import Clock
+59 -9
@@ -1,20 +1,70 @@
"""`Clock` Protocol.
"""``Clock`` Protocol — replay/live-agnostic monotonic + wall-clock time.
R-DEMO-4: production C1-C5 paths bake real-time-cadence assumptions; injected
Clock lets replay mode trip those timers consistently against tlog timestamps.
Frozen at AZ-398 v1.0.0 per the replay contract:
``_docs/02_document/contracts/replay/replay_protocol.md``.
Owned by AZ-401. Bootstrap ships the interface stub.
The Protocol is Layer 1 cross-cutting per ``module-layout.md`` — every
component that previously called :func:`time.monotonic_ns`,
:func:`time.time_ns`, or :func:`time.sleep` MUST consume an injected
:class:`Clock` instead (Invariant 2). The strategy is selected exactly
once at composition time (Invariant — Single Clock per process):
- **Live / research / operator** binaries inject :class:`WallClock`.
- **Replay** binary injects :class:`TlogDerivedClock` (ASAP) or
:class:`WallClock` (REALTIME pace).
Mode-specific behaviour lives in the strategy; consumers see only the
``Clock`` interface (R-DEMO-4 mitigation).
"""
from __future__ import annotations
from datetime import datetime
from typing import Protocol
from typing import Protocol, runtime_checkable
@runtime_checkable
class Clock(Protocol):
"""A monotonic clock abstraction."""
"""Monotonic + wall-clock + sleep-until abstraction (AZ-398 v1.0.0).
def now(self) -> datetime: ...
All three methods are non-blocking except :meth:`sleep_until_ns`,
which honours the configured replay pace:
def monotonic(self) -> float: ...
- ``WallClock.sleep_until_ns(t)`` blocks until ``time.monotonic_ns()``
catches up to ``t`` (live + REALTIME replay).
- ``TlogDerivedClock.sleep_until_ns(t)`` is a no-op (ASAP replay).
Strategies MUST guarantee :meth:`monotonic_ns` is non-decreasing
across calls within the same process (Invariant 3 spirit).
"""
def monotonic_ns(self) -> int:
"""Return the strategy's monotonic time in nanoseconds.
For :class:`WallClock` this delegates to
:func:`time.monotonic_ns`. For :class:`TlogDerivedClock` this
returns the most recently advanced tlog timestamp (advance-on-
call semantics — see AC-6).
"""
...
def time_ns(self) -> int:
"""Return the strategy's UTC wall-clock time in nanoseconds.
Used for log timestamps that need calendar alignment (FDR
records, STATUSTEXT). For :class:`WallClock` this is
:func:`time.time_ns`; for :class:`TlogDerivedClock` this is the
tlog message's wall-clock timestamp (the ``time_unix_usec`` /
``time_boot_ms`` field, normalised to ns).
"""
...
def sleep_until_ns(self, target_ns: int) -> None:
"""Block until :meth:`monotonic_ns` would return ``target_ns``.
Honours ``pace=REALTIME`` by sleeping the wall-clock delta; honours
``pace=ASAP`` by no-op'ing. ``target_ns`` already in the past is a
no-op (no exception, no negative sleep). The Protocol does not
prescribe spurious-wakeup behaviour; strategies SHOULD use
:func:`time.sleep` (which retries internally on POSIX).
"""
...
@@ -0,0 +1,92 @@
"""``TlogDerivedClock`` strategy (AZ-398) — replay-only.
Advances ``monotonic_ns`` on each call to the next timestamp emitted by
the wrapped tlog-timestamp source. Out-of-order timestamps raise
:class:`ClockOrderingError` (AC-6) — replay determinism is hard-failed,
never silently smoothed.
The strategy is constructed by the replay composition root (AZ-401)
with a callable that yields tlog timestamps as the parser advances.
For unit tests, an iterator of pre-known timestamps suffices.
"""
from __future__ import annotations
from collections.abc import Callable, Iterable, Iterator
class ClockOrderingError(RuntimeError):
"""Raised when the tlog-timestamp source emits a non-monotonic value.
Replay must be deterministic; a strategy that silently smooths
backward jumps would mask a genuine recording corruption. The error
names the offending pair so the operator can correlate against the
tlog message stream.
"""
class TlogDerivedClock:
"""Replay :class:`Clock` strategy driven by the tlog timestamp stream.
The source can be either a callable returning ``int`` ns (typical
when wired against the live tlog parser, AZ-399) or an iterable of
pre-known ``int`` ns values (typical in unit tests). Both are normalised
to an internal :class:`Iterator` lazily.
Semantics:
- :meth:`monotonic_ns` pulls the next value from the source on every
call and returns it (advance-on-call). The most-recently-returned
value is cached for :meth:`time_ns` so the two methods stay aligned.
- :meth:`time_ns` returns the latest cached value; if no call to
:meth:`monotonic_ns` has happened yet, it returns 0 (the replay
composition root must pump at least one frame before any consumer
asks for wall-clock time).
- :meth:`sleep_until_ns` is a no-op (``pace=ASAP``).
"""
__slots__ = ("_source", "_last_ns")
def __init__(
self,
source: Callable[[], int] | Iterable[int],
) -> None:
if callable(source):
self._source: Iterator[int] = _iter_from_callable(source)
else:
self._source = iter(source)
self._last_ns = 0
def monotonic_ns(self) -> int:
try:
next_ns = next(self._source)
except StopIteration:
return self._last_ns
if next_ns < self._last_ns:
raise ClockOrderingError(
f"TlogDerivedClock: non-monotonic timestamp "
f"{next_ns} ns followed {self._last_ns} ns"
)
self._last_ns = next_ns
return next_ns
def time_ns(self) -> int:
return self._last_ns
def sleep_until_ns(self, target_ns: int) -> None:
"""No-op in ASAP pace (Invariant 6)."""
return None
def _iter_from_callable(source: Callable[[], int]) -> Iterator[int]:
"""Wrap a callable as an iterator that calls it on each ``next()``.
Used when the tlog parser exposes a ``next_timestamp_ns()``-style
hook; consumers should NOT pass a side-effectful callable that
blocks — the source is expected to be cheap (microsecond-class).
"""
while True:
yield source()
__all__ = ["ClockOrderingError", "TlogDerivedClock"]
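The advance-on-call semantics documented for `TlogDerivedClock` can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the module above: `SketchClock` and `OrderingError` are hypothetical names, and only the iterable source, the ordering check, and the exhausted-source fallback are reproduced.

```python
from collections.abc import Iterable, Iterator


class OrderingError(RuntimeError):
    """Hypothetical stand-in for ClockOrderingError."""


class SketchClock:
    """Hypothetical stand-in mirroring the advance-on-call semantics."""

    def __init__(self, source: Iterable[int]) -> None:
        self._source: Iterator[int] = iter(source)
        self._last_ns = 0

    def monotonic_ns(self) -> int:
        try:
            next_ns = next(self._source)
        except StopIteration:
            # Exhausted source: keep returning the last value seen.
            return self._last_ns
        if next_ns < self._last_ns:
            raise OrderingError(f"{next_ns} ns followed {self._last_ns} ns")
        self._last_ns = next_ns
        return next_ns

    def time_ns(self) -> int:
        # Never advances the source; reports the cached value only.
        return self._last_ns


clock = SketchClock([100, 250, 250])
before_first_call = clock.time_ns()
readings = [clock.monotonic_ns() for _ in range(4)]

backward = SketchClock([500, 400])
backward.monotonic_ns()
try:
    backward.monotonic_ns()
    raised = False
except OrderingError:
    raised = True
```

Here `time_ns()` reports 0 before the first pull, a repeated timestamp is accepted, the fourth call repeats the last value because the source has run out, and the backward jump from 500 to 400 is rejected rather than smoothed.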
@@ -0,0 +1,42 @@
"""``WallClock`` strategy (AZ-398) — live + REALTIME replay.
Thin :class:`Clock` adapter over the standard-library :mod:`time`
module. Owned by ``clock/`` so the AC-4 AST scan over ``components/``
remains clean: components MUST NOT call :func:`time.monotonic_ns`
directly; they call :meth:`WallClock.monotonic_ns` via injection.
"""
from __future__ import annotations
import time
class WallClock:
"""Default :class:`Clock` strategy backed by :mod:`time`.
Stateless; constructable without arguments. All three methods are
trivially Liskov-clean over the Protocol.
"""
__slots__ = ()
def monotonic_ns(self) -> int:
return time.monotonic_ns()
def time_ns(self) -> int:
return time.time_ns()
def sleep_until_ns(self, target_ns: int) -> None:
"""Block until ``time.monotonic_ns() >= target_ns``.
A target already in the past is a no-op. Sub-millisecond
oversleep is accepted (AC-5: ≤ 5 ms drift on a 100 ms sleep).
"""
now = time.monotonic_ns()
delta_ns = target_ns - now
if delta_ns <= 0:
return
time.sleep(delta_ns / 1_000_000_000.0)
__all__ = ["WallClock"]
@@ -12,13 +12,15 @@ Retry policy (FAC-INV-5):
from __future__ import annotations
import time
from collections.abc import Callable
from pathlib import Path
from typing import Final
from uuid import UUID
import httpx
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard._types.geo import BoundingBox, LatLonAlt
from gps_denied_onboard.components.c12_operator_tooling.flights_api._parser import (
parse_flight_payload,
@@ -49,6 +51,18 @@ _REDACTED: Final[str] = "<redacted>"
_RETRY_BACKOFF_S: Final[float] = 1.0
def _wall_clock_sleep(seconds: float) -> None:
"""Default sleep hook — routes through :class:`WallClock`.
Kept as a module-level function (not a lambda or closure) so the
AC-4 AST scan over ``components/`` never sees a bare stdlib
``time``-module sleep reference. Tests inject their own ``sleep``
callable to skip the backoff.
"""
clock = WallClock()
clock.sleep_until_ns(clock.monotonic_ns() + int(seconds * 1_000_000_000))
class HttpxFlightsApiClient:
"""Concrete :class:`FlightsApiClient` against the parent-suite ``flights`` REST API.
@@ -64,10 +78,12 @@ class HttpxFlightsApiClient:
self,
*,
transport: httpx.BaseTransport | None = None,
sleep: object = time.sleep,
sleep: Callable[[float], None] | None = None,
) -> None:
self._transport = transport
self._sleep = sleep
self._sleep: Callable[[float], None] = (
sleep if sleep is not None else _wall_clock_sleep
)
self._log = get_logger("c12.flights_api")
def fetch_flight(
@@ -162,7 +178,7 @@ class HttpxFlightsApiClient:
},
},
)
self._sleep(_RETRY_BACKOFF_S) # type: ignore[operator]
self._sleep(_RETRY_BACKOFF_S)
try:
response = client.get(url, headers=headers)
except (httpx.ConnectError, httpx.ConnectTimeout, httpx.ReadTimeout) as exc:
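The injectable `sleep` hook in the hunks above lets tests skip the backoff without touching the stdlib `time` module. The sketch below illustrates the pattern in isolation; `RetryingClient`, `default_sleep`, and the single-retry policy are assumptions made for the example and do not describe the real `HttpxFlightsApiClient` retry rules.

```python
from __future__ import annotations

from collections.abc import Callable

RETRY_BACKOFF_S = 1.0  # mirrors the value of _RETRY_BACKOFF_S in the diff


def default_sleep(seconds: float) -> None:
    """Hypothetical default hook; the real one routes through WallClock."""
    raise AssertionError("tests should never reach the real sleep")


class RetryingClient:
    """Hypothetical stand-in for a client with an injectable sleep hook."""

    def __init__(self, *, sleep: Callable[[float], None] | None = None) -> None:
        self._sleep: Callable[[float], None] = (
            sleep if sleep is not None else default_sleep
        )

    def fetch(self, attempt: Callable[[], str]) -> str:
        try:
            return attempt()
        except ConnectionError:
            # Back off once through the injected hook, then retry
            # (single retry assumed for illustration only).
            self._sleep(RETRY_BACKOFF_S)
            return attempt()


slept: list[float] = []
outcomes = iter([ConnectionError("down"), "ok"])


def flaky() -> str:
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item


client = RetryingClient(sleep=slept.append)
result = client.fetch(flaky)
```

Passing `slept.append` as the hook records the requested backoff instead of waiting, so the test can assert on both the outcome and the backoff duration.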
@@ -29,8 +29,10 @@ from collections.abc import Callable, Sequence
from dataclasses import asdict
from datetime import datetime, timezone
from pathlib import Path
from typing import TYPE_CHECKING
from uuid import UUID
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c13_fdr.errors import (
FdrAlreadyClosedError,
FdrCloseWithoutOpenError,
@@ -53,6 +55,9 @@ from gps_denied_onboard.fdr_client.records import (
)
from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
__all__ = ["FileFdrWriter"]
_FLIGHT_HEADER_KIND = "flight_header"
@@ -97,6 +102,7 @@ class FileFdrWriter:
on_rotation: Callable[[FileFdrWriter, int], None] | None = None,
record_kind_policy: RecordKindPolicy | None = None,
drain_sleep_s: float = _DEFAULT_DRAIN_SLEEP_S,
clock: Clock | None = None,
) -> None:
self._flight_root = Path(flight_root)
self._flight_id = flight_id
@@ -106,6 +112,7 @@ class FileFdrWriter:
self._on_rotation = on_rotation
self._record_kind_policy = record_kind_policy
self._drain_sleep_s = drain_sleep_s
self._clock: Clock = clock if clock is not None else WallClock()
# Filesystem state.
self._flight_dir: Path = self._flight_root / str(flight_id)
@@ -312,7 +319,7 @@ class FileFdrWriter:
# iterate until the value is stable. Practically this converges
# in one or two passes.
ts = _iso_now()
mono_ns = time.monotonic_ns()
mono_ns = self._clock.monotonic_ns()
records_written_now = self._records_written + 1 # +1 for the footer itself
bytes_estimate = self._bytes_written
footer: FlightFooter | None = None
@@ -1,6 +1,53 @@
"""C1 VIO component — Public API."""
"""C1 VIO — Public API (AZ-331).
from gps_denied_onboard._types.vio import VioOutput
Per ``vio_strategy_protocol.md`` v1.0.0 the public surface consists
of:
- :class:`VioStrategy` Protocol (4 methods).
- DTOs / enum re-exported from :mod:`gps_denied_onboard._types.nav`
(the L1 home for cross-component DTOs): :class:`VioOutput`,
:class:`VioHealth`, :class:`FeatureQuality`, :class:`WarmStartPose`,
:class:`VioState`.
- Error family rooted at :class:`VioError`; three documented subtypes.
- Config block :class:`C1VioConfig` (registered on import).
Concrete strategies (``Okvis2Strategy``, ``VinsMonoStrategy``,
``KltRansacStrategy``) live in sibling modules and are imported
lazily by :mod:`gps_denied_onboard.runtime_root.vio_factory` —
Risk-2 mitigation: this ``__init__.py`` MUST NOT import any concrete
strategy module.
"""
from gps_denied_onboard._types.nav import (
FeatureQuality,
VioHealth,
VioOutput,
VioState,
WarmStartPose,
)
from gps_denied_onboard.components.c1_vio.config import C1VioConfig, Okvis2Config
from gps_denied_onboard.components.c1_vio.errors import (
VioDegradedError,
VioError,
VioFatalError,
VioInitializingError,
)
from gps_denied_onboard.components.c1_vio.interface import VioStrategy
from gps_denied_onboard.config.schema import register_component_block
__all__ = ["VioOutput", "VioStrategy"]
register_component_block("c1_vio", C1VioConfig)
__all__ = [
"C1VioConfig",
"FeatureQuality",
"Okvis2Config",
"VioDegradedError",
"VioError",
"VioFatalError",
"VioHealth",
"VioInitializingError",
"VioOutput",
"VioState",
"VioStrategy",
"WarmStartPose",
]
@@ -0,0 +1,318 @@
// AZ-332 — pybind11 binding for OKVIS2 (production-default C1 VIO).
//
// Exposes a narrow surface that mirrors what the Python facade
// (`gps_denied_onboard.components.c1_vio.okvis2.Okvis2Strategy`)
// needs — NOT the full OKVIS2 estimator API. The surface is:
//
// Okvis2Backend
// ctor(yaml_config: str, camera_intrinsics_3x3: ndarray[float64, 3, 3])
// add_frame(frame_id: str, ts_ns: int, image: ndarray[uint8, H, W, C]) -> bool
// add_imu(ts_ns: int, accel: ndarray[float64, 3], gyro: ndarray[float64, 3]) -> None
// get_latest_output() -> dict | None
// reset(body_T_world: ndarray[float64, 4, 4], velocity: ndarray[float64, 3],
// accel_bias: ndarray[float64, 3], gyro_bias: ndarray[float64, 3]) -> None
// health() -> dict
//
// Frame buffers cross the FFI boundary as `py::array_t<uint8_t,
// c_style|forcecast>` so the camera-ingest path (AZ-265
// LiveCameraFrameSource) can hand off a C-contiguous uint8 numpy array
// without a copy (forcecast converts, and therefore copies, only when the
// dtype or layout does not already match). Risk-2 mitigation per the
// AZ-332 task spec.
//
// Exception envelope: every OKVIS2 / Eigen / std::runtime_error inside a
// binding method is caught and rethrown as one of three Python-side
// exceptions registered via `py::register_exception`. The Python facade
// then rewraps those into the VioError family.
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>
#include <Eigen/Core>
#include <Eigen/Geometry>
#include <array>
#include <cmath>
#include <cstdint>
#include <memory>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <string>
// OKVIS2 estimator headers. The exact include path is determined by the
// vendored upstream's CMake export. The skeleton compiles without these
// headers because the actual ThreadedKFVio wiring lives in
// _build_estimator() / _drive_estimator(), which are stubs today:
// _drive_estimator() throws a runtime error if invoked. Wiring them in is
// the follow-up task within AZ-332's tier2 deliverable bundle.
//
// #include <okvis/ThreadedKFVio.hpp>
// #include <okvis/Estimator.hpp>
// #include <okvis/VioParametersReader.hpp>
namespace py = pybind11;
namespace {
// ---------------------------------------------------------------------------
// Exception types — registered as Python-side classes via
// `py::register_exception` in PYBIND11_MODULE below. The Python facade
// catches these and rewraps into the VioError family.
class OkvisInitException : public std::runtime_error {
public:
using std::runtime_error::runtime_error;
};
class OkvisFatalException : public std::runtime_error {
public:
using std::runtime_error::runtime_error;
};
class OkvisOptimizationException : public std::runtime_error {
public:
using std::runtime_error::runtime_error;
};
// ---------------------------------------------------------------------------
// Pose / output struct produced by the estimator step.
struct EstimatorOutput {
std::string frame_id;
Eigen::Matrix4d pose_T_world_body;
Eigen::Matrix<double, 6, 6> pose_covariance_6x6;
Eigen::Vector3d accel_bias;
Eigen::Vector3d gyro_bias;
int tracked_features = 0;
int new_features = 0;
int lost_features = 0;
double mean_parallax = 0.0;
double mre_px = 0.0;
std::int64_t emitted_at_ns = 0;
};
// ---------------------------------------------------------------------------
// Internal estimator state machine — INIT until N keyframes converge,
// TRACKING during nominal operation, DEGRADED on feature-count drop,
// LOST after consecutive failed updates.
enum class HealthState : int { Init = 0, Tracking = 1, Degraded = 2, Lost = 3 };
const char* state_to_str(HealthState s) {
switch (s) {
case HealthState::Init:
return "init";
case HealthState::Tracking:
return "tracking";
case HealthState::Degraded:
return "degraded";
case HealthState::Lost:
return "lost";
}
return "init";
}
// ---------------------------------------------------------------------------
// Okvis2Backend — the C++ surface exposed to Python.
class Okvis2Backend {
public:
Okvis2Backend(const std::string& yaml_config,
py::array_t<double, py::array::c_style | py::array::forcecast>
camera_intrinsics_3x3)
: yaml_config_(yaml_config) {
if (camera_intrinsics_3x3.ndim() != 2 ||
camera_intrinsics_3x3.shape(0) != 3 ||
camera_intrinsics_3x3.shape(1) != 3) {
throw OkvisInitException(
"Okvis2Backend: camera_intrinsics_3x3 must be a 3x3 float64 array");
}
auto buf = camera_intrinsics_3x3.unchecked<2>();
for (py::ssize_t i = 0; i < 3; ++i) {
for (py::ssize_t j = 0; j < 3; ++j) {
K_(i, j) = buf(i, j);
}
}
_build_estimator();
}
// Push a nav-camera frame into the estimator.
// Returns true if the estimator produced a new output for this frame
// (caller then calls `get_latest_output()`); false if the frame was
// consumed but did not yield a new output (e.g. dropped as non-keyframe).
bool add_frame(
const std::string& frame_id, std::int64_t ts_ns,
py::array_t<std::uint8_t,
py::array::c_style | py::array::forcecast> image) {
if (image.ndim() < 2 || image.ndim() > 3) {
throw OkvisOptimizationException(
"Okvis2Backend.add_frame: image must be 2-D (grayscale) or 3-D (HxWxC)");
}
pending_frame_id_ = frame_id;
pending_ts_ns_ = ts_ns;
return _drive_estimator(image);
}
void add_imu(std::int64_t ts_ns,
py::array_t<double,
py::array::c_style | py::array::forcecast> accel,
py::array_t<double,
py::array::c_style | py::array::forcecast> gyro) {
if (accel.size() != 3 || gyro.size() != 3) {
throw OkvisOptimizationException(
"Okvis2Backend.add_imu: accel and gyro must be length-3 float64 arrays");
}
if (ts_ns <= last_imu_ts_ns_) {
throw OkvisOptimizationException(
"Okvis2Backend.add_imu: ts_ns must be strict-monotonic");
}
last_imu_ts_ns_ = ts_ns;
// Real OKVIS2 IMU push lands here once the estimator is wired in.
// For the skeleton we just record the most recent sample — the
// estimator's IMU integration is performed inside ThreadedKFVio.
auto a = accel.unchecked<1>();
auto g = gyro.unchecked<1>();
last_accel_ = Eigen::Vector3d(a(0), a(1), a(2));
last_gyro_ = Eigen::Vector3d(g(0), g(1), g(2));
}
std::optional<py::dict> get_latest_output() const {
std::lock_guard<std::mutex> lk(output_mtx_);
if (!latest_output_.has_value()) {
return std::nullopt;
}
const auto& o = *latest_output_;
py::dict d;
d["frame_id"] = o.frame_id;
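// Eigen matrices are column-major by default, so the row stride is one
// element and the column stride is one full column.
d["pose_T_world_body"] = py::array_t<double>(
{4, 4}, {sizeof(double), sizeof(double) * 4},
o.pose_T_world_body.data());
d["pose_covariance_6x6"] = py::array_t<double>(
{6, 6}, {sizeof(double), sizeof(double) * 6},
o.pose_covariance_6x6.data());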
d["accel_bias"] = py::array_t<double>(
{3}, {sizeof(double)}, o.accel_bias.data());
d["gyro_bias"] = py::array_t<double>(
{3}, {sizeof(double)}, o.gyro_bias.data());
d["tracked_features"] = o.tracked_features;
d["new_features"] = o.new_features;
d["lost_features"] = o.lost_features;
d["mean_parallax"] = o.mean_parallax;
d["mre_px"] = o.mre_px;
d["emitted_at_ns"] = o.emitted_at_ns;
return d;
}
void reset(py::array_t<double,
py::array::c_style | py::array::forcecast> body_T_world,
py::array_t<double,
py::array::c_style | py::array::forcecast> velocity,
py::array_t<double,
py::array::c_style | py::array::forcecast> accel_bias,
py::array_t<double,
py::array::c_style | py::array::forcecast> gyro_bias) {
if (body_T_world.ndim() != 2 || body_T_world.shape(0) != 4 ||
body_T_world.shape(1) != 4) {
throw OkvisInitException(
"Okvis2Backend.reset: body_T_world must be a 4x4 float64 array");
}
if (velocity.size() != 3 || accel_bias.size() != 3 || gyro_bias.size() != 3) {
throw OkvisInitException(
"Okvis2Backend.reset: velocity / *_bias must be length-3 float64 arrays");
}
auto T = body_T_world.unchecked<2>();
for (py::ssize_t i = 0; i < 4; ++i) {
for (py::ssize_t j = 0; j < 4; ++j) {
seed_body_T_world_(i, j) = T(i, j);
}
}
auto v = velocity.unchecked<1>();
auto ab = accel_bias.unchecked<1>();
auto gb = gyro_bias.unchecked<1>();
seed_velocity_ = Eigen::Vector3d(v(0), v(1), v(2));
seed_accel_bias_ = Eigen::Vector3d(ab(0), ab(1), ab(2));
seed_gyro_bias_ = Eigen::Vector3d(gb(0), gb(1), gb(2));
state_ = HealthState::Init;
consecutive_lost_ = 0;
{
std::lock_guard<std::mutex> lk(output_mtx_);
latest_output_.reset();
}
_build_estimator();
}
py::dict health() const {
py::dict d;
d["state"] = std::string(state_to_str(state_));
d["consecutive_lost"] = consecutive_lost_;
d["bias_norm"] = std::sqrt(
seed_accel_bias_.squaredNorm() + seed_gyro_bias_.squaredNorm());
return d;
}
private:
void _build_estimator() {
// Real wiring: instantiate okvis::ThreadedKFVio from yaml_config_,
// attach output callback that fills latest_output_ under output_mtx_.
//
// The skeleton intentionally throws on any actual frame ingest so a
// production binary that loads this binding before AZ-332's
// estimator wiring lands cannot silently report misleading poses.
estimator_built_ = false;
}
bool _drive_estimator(
py::array_t<std::uint8_t,
py::array::c_style | py::array::forcecast> /*image*/) {
if (!estimator_built_) {
// Skeleton path — pybind11 binding compiles and loads but the
// OKVIS2 estimator is not yet wired. Tier-2 follow-up wires it up.
throw OkvisFatalException(
"Okvis2Backend: OKVIS2 estimator not yet wired — this binding "
"is the AZ-332 skeleton; tier2 follow-up wires okvis::ThreadedKFVio");
}
return false;
}
std::string yaml_config_;
Eigen::Matrix3d K_ = Eigen::Matrix3d::Identity();
Eigen::Matrix4d seed_body_T_world_ = Eigen::Matrix4d::Identity();
Eigen::Vector3d seed_velocity_ = Eigen::Vector3d::Zero();
Eigen::Vector3d seed_accel_bias_ = Eigen::Vector3d::Zero();
Eigen::Vector3d seed_gyro_bias_ = Eigen::Vector3d::Zero();
Eigen::Vector3d last_accel_ = Eigen::Vector3d::Zero();
Eigen::Vector3d last_gyro_ = Eigen::Vector3d::Zero();
HealthState state_ = HealthState::Init;
int consecutive_lost_ = 0;
std::int64_t last_imu_ts_ns_ = -1;
std::string pending_frame_id_;
std::int64_t pending_ts_ns_ = 0;
bool estimator_built_ = false;
mutable std::mutex output_mtx_;
std::optional<EstimatorOutput> latest_output_;
};
} // namespace
PYBIND11_MODULE(okvis2_binding, m) {
m.doc() =
"OKVIS2 pybind11 binding (AZ-332). Wraps okvis::ThreadedKFVio for the "
"Python Okvis2Strategy facade. Tier2 follow-up wires the real estimator.";
py::register_exception<OkvisInitException>(m, "OkvisInitException");
py::register_exception<OkvisFatalException>(m, "OkvisFatalException");
py::register_exception<OkvisOptimizationException>(
m, "OkvisOptimizationException");
py::class_<Okvis2Backend>(m, "Okvis2Backend")
.def(py::init<const std::string&,
py::array_t<double, py::array::c_style | py::array::forcecast>>(),
py::arg("yaml_config"), py::arg("camera_intrinsics_3x3"))
.def("add_frame", &Okvis2Backend::add_frame, py::arg("frame_id"),
py::arg("ts_ns"), py::arg("image"))
.def("add_imu", &Okvis2Backend::add_imu, py::arg("ts_ns"),
py::arg("accel"), py::arg("gyro"))
.def("get_latest_output", &Okvis2Backend::get_latest_output)
.def("reset", &Okvis2Backend::reset, py::arg("body_T_world"),
py::arg("velocity"), py::arg("accel_bias"), py::arg("gyro_bias"))
.def("health", &Okvis2Backend::health);
}
@@ -0,0 +1,6 @@
"""C1 VIO microbench harness (AZ-332).
The bench scripts are tier2 / Jetson-only — they exercise the real OKVIS2
binding (or fake binding for cross-platform smoke) and report per-frame
latency percentiles for C1-PT-01 / NFT-PERF-01.
"""
@@ -0,0 +1,196 @@
"""``python -m gps_denied_onboard.components.c1_vio.bench.okvis2`` (AZ-332).
Microbench for :class:`Okvis2Strategy` — reads a fixture directory of
nav-camera frames + IMU samples and reports per-frame latency
percentiles for C1-PT-01 (p50 <= 25 ms, p95 <= 80 ms, p95 failure
threshold 120 ms).
The bench exercises production behaviour: it constructs the real
strategy via the AZ-331 factory (so ``BUILD_OKVIS2=ON`` is required),
feeds real frames through, and measures wall-clock per call. On Tier-2
this measures OKVIS2's actual estimator latency; on a workstation with
``BUILD_OKVIS2=OFF`` it refuses to start (Risk-2 — never silently
benchmark a stub).
"""
from __future__ import annotations
import argparse
import json
import sys
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
import numpy as np
from gps_denied_onboard._types.nav import (
ImuSample,
ImuWindow,
NavCameraFrame,
)
from gps_denied_onboard.components.c1_vio.config import (
C1VioConfig,
Okvis2Config,
)
from gps_denied_onboard.config.schema import Config, RuntimeConfig
from gps_denied_onboard.fdr_client.client import make_fdr_client
from gps_denied_onboard.runtime_root.vio_factory import build_vio_strategy
def _percentile(samples_ms: list[float], pct: float) -> float:
if not samples_ms:
return float("nan")
sorted_samples = sorted(samples_ms)
idx = min(len(sorted_samples) - 1, int(pct * len(sorted_samples)))
return sorted_samples[idx]
def _load_fixture(fixture_dir: Path) -> tuple[Any, list[NavCameraFrame], list[ImuWindow]]:
"""Fixture format (minimal, deterministic):
.. code::
fixture_dir/
manifest.json { "frame_count": N, "camera_calibration_path": "..." }
frames/0000.npy uint8 image
...
imu/0000.json {"samples": [{"ts_ns": N, "accel": [..], "gyro": [..]}, ...]}
...
"""
manifest_path = fixture_dir / "manifest.json"
if not manifest_path.is_file():
raise FileNotFoundError(f"missing manifest.json under {fixture_dir!r}")
manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
frames: list[NavCameraFrame] = []
imu_windows: list[ImuWindow] = []
frame_count = int(manifest["frame_count"])
for i in range(frame_count):
img_path = fixture_dir / "frames" / f"{i:04d}.npy"
imu_path = fixture_dir / "imu" / f"{i:04d}.json"
img = np.load(img_path)
imu_blob = json.loads(imu_path.read_text(encoding="utf-8"))
samples = tuple(
ImuSample(
ts_ns=int(s["ts_ns"]),
accel_xyz=tuple(s["accel"]),
gyro_xyz=tuple(s["gyro"]),
)
for s in imu_blob["samples"]
)
if not samples:
raise ValueError(
f"bench.okvis2: fixture frame {i} ({imu_path}) has no IMU "
"samples — bench requires a real IMU window per frame"
)
ts_start = samples[0].ts_ns
ts_end = samples[-1].ts_ns
imu_windows.append(ImuWindow(samples=samples, ts_start_ns=ts_start, ts_end_ns=ts_end))
frames.append(
NavCameraFrame(
frame_id=i,
timestamp=datetime.fromtimestamp(ts_start * 1e-9, tz=timezone.utc),
image=img,
camera_calibration_id=str(manifest.get("camera_calibration_id", "bench")),
)
)
return manifest, frames, imu_windows
def _make_calibration(intrinsics_path: str | None) -> Any:
"""Build a CameraCalibration with an identity body-to-camera transform
from the bench's calibration JSON; raise if no path is supplied.
"""
from gps_denied_onboard._types.calibration import CameraCalibration
if intrinsics_path is None:
raise ValueError("bench.okvis2: --camera-calibration is required (real intrinsics)")
blob = json.loads(Path(intrinsics_path).read_text(encoding="utf-8"))
return CameraCalibration(
camera_id=blob.get("camera_id", "bench"),
intrinsics_3x3=np.asarray(blob["intrinsics_3x3"], dtype=np.float64),
distortion=np.asarray(blob.get("distortion", [0, 0, 0, 0]), dtype=np.float64),
body_to_camera_se3=np.eye(4, dtype=np.float64),
acquisition_method=blob.get("acquisition_method", "bench-static"),
metadata=dict(blob.get("metadata", {})),
)
def main(argv: list[str] | None = None) -> int:
parser = argparse.ArgumentParser(
prog="python -m gps_denied_onboard.components.c1_vio.bench.okvis2",
description="Microbench for Okvis2Strategy.process_frame (AZ-332 / C1-PT-01).",
)
parser.add_argument("fixture_dir", type=Path, help="Path to fixture directory")
parser.add_argument(
"--camera-calibration",
type=str,
required=True,
help="Path to camera calibration JSON",
)
parser.add_argument(
"--warmup",
type=int,
default=10,
help="Number of warmup frames (not counted in percentiles)",
)
args = parser.parse_args(argv)
manifest, frames, imu_windows = _load_fixture(args.fixture_dir)
calibration = _make_calibration(args.camera_calibration)
config = Config.with_blocks(
c1_vio=C1VioConfig(strategy="okvis2", okvis2=Okvis2Config()),
runtime=RuntimeConfig(
camera_calibration_path=args.camera_calibration,
inference_backend="tensorrt",
tier=2,
),
)
fdr_client = make_fdr_client("c1_vio.okvis2.bench", config)
strategy = build_vio_strategy(config, fdr_client=fdr_client)
durations_ms: list[float] = []
for i, (frame, imu) in enumerate(zip(frames, imu_windows, strict=True)):
t0 = time.perf_counter()
try:
strategy.process_frame(frame, imu, calibration)
except Exception as exc:
print(
f"frame {i}: exception {type(exc).__name__}: {exc}",
file=sys.stderr,
)
continue
dt_ms = (time.perf_counter() - t0) * 1000.0
if i >= args.warmup:
durations_ms.append(dt_ms)
if not durations_ms:
print("bench: no successful frames after warmup", file=sys.stderr)
return 2
p50 = _percentile(durations_ms, 0.50)
p95 = _percentile(durations_ms, 0.95)
p99 = _percentile(durations_ms, 0.99)
print(
json.dumps(
{
"fixture_dir": str(args.fixture_dir),
"frame_count": manifest.get("frame_count"),
"measured": len(durations_ms),
"p50_ms": round(p50, 3),
"p95_ms": round(p95, 3),
"p99_ms": round(p99, 3),
"c1_pt_01_target_p50_ms": 25.0,
"c1_pt_01_target_p95_ms": 80.0,
"c1_pt_01_failure_p95_ms": 120.0,
},
indent=2,
)
)
return 0
if __name__ == "__main__":
raise SystemExit(main())
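The index rule used by `_percentile` above is easiest to check on a small input. The sketch below copies that rule into a standalone function (named `percentile` here only to keep the example self-contained) and applies it to ten made-up latency samples; the sample values are illustrative, not measurements.

```python
import math


def percentile(samples_ms: list[float], pct: float) -> float:
    """Same index rule as the bench helper: floor(pct * n), clamped to n - 1."""
    if not samples_ms:
        return float("nan")
    sorted_samples = sorted(samples_ms)
    idx = min(len(sorted_samples) - 1, int(pct * len(sorted_samples)))
    return sorted_samples[idx]


# Ten illustrative samples: 10 ms, 20 ms, ..., 100 ms (deliberately unsorted).
samples = [float(v) for v in (70, 10, 100, 40, 20, 90, 30, 60, 50, 80)]

p50 = percentile(samples, 0.50)  # index int(5.0) = 5
p95 = percentile(samples, 0.95)  # index int(9.5) = 9
empty = percentile([], 0.50)
```

With an even number of samples this rule returns the upper of the two middle values for p50 (60 ms here, not 55 ms), and for small sample counts p95 and p99 both collapse onto the maximum, so the bench needs a reasonably long fixture before its tail percentiles mean much.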
@@ -0,0 +1,128 @@
"""C1 VIO strategy config block (AZ-331 + AZ-332).
Registered into ``config.components['c1_vio']`` by the package
``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.vio_factory.build_vio_strategy`
reads this block to select the strategy and configure the LOST->FATAL
transition + warm-start convergence budget.
AZ-332 extends this with a nested :class:`Okvis2Config` sub-block
carrying OKVIS2-specific knobs (sliding-window size, parallax-driven
keyframe threshold, RANSAC inlier ratio, max optimisation iterations,
degraded-feature threshold, per-frame debug log). Only consulted when
``strategy == "okvis2"``.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"KNOWN_STRATEGIES",
"C1VioConfig",
"Okvis2Config",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset({"okvis2", "vins_mono", "klt_ransac"})
@dataclass(frozen=True)
class Okvis2Config:
"""OKVIS2-specific knobs (AZ-332).
``keyframe_window_size`` is the sliding-window keyframe count K
per D-C5-3 — must be in [10, 20]. Lower values lose accuracy;
higher values exceed the C1-PT-01 per-frame budget on Tier-2.
``keyframe_parallax_threshold_px`` is the parallax-driven keyframe
decision; default 3.0 px (OKVIS2 upstream default).
``ransac_inlier_ratio`` is the RANSAC inlier-ratio threshold below
which the frontend declares the frame untrackable; default 0.5.
``max_optimization_iters`` caps the per-frame Levenberg-Marquardt
iterations to bound worst-case latency; default 4 (OKVIS2 default).
``degraded_feature_threshold`` is the per-frame tracked-feature
count below which ``health_snapshot`` reports DEGRADED; default 30.
``per_frame_debug_log`` enables a DEBUG log line per ``process_frame``
— OFF by default (would flood at 3 Hz steady-state).
"""
keyframe_window_size: int = 15
keyframe_parallax_threshold_px: float = 3.0
ransac_inlier_ratio: float = 0.5
max_optimization_iters: int = 4
degraded_feature_threshold: int = 30
per_frame_debug_log: bool = False
def __post_init__(self) -> None:
if not (10 <= self.keyframe_window_size <= 20):
raise ConfigError(
"Okvis2Config.keyframe_window_size must be in [10, 20] "
f"(D-C5-3 budget); got {self.keyframe_window_size}"
)
if self.keyframe_parallax_threshold_px <= 0.0:
raise ConfigError(
"Okvis2Config.keyframe_parallax_threshold_px must be > 0; "
f"got {self.keyframe_parallax_threshold_px}"
)
if not (0.0 < self.ransac_inlier_ratio <= 1.0):
raise ConfigError(
"Okvis2Config.ransac_inlier_ratio must be in (0.0, 1.0]; "
f"got {self.ransac_inlier_ratio}"
)
if self.max_optimization_iters < 1:
raise ConfigError(
"Okvis2Config.max_optimization_iters must be >= 1; "
f"got {self.max_optimization_iters}"
)
if self.degraded_feature_threshold < 1:
raise ConfigError(
"Okvis2Config.degraded_feature_threshold must be >= 1; "
f"got {self.degraded_feature_threshold}"
)
@dataclass(frozen=True)
class C1VioConfig:
"""Per-component config for C1 VIO.
``strategy`` selects exactly one of the three backends
(``okvis2`` / ``vins_mono`` / ``klt_ransac``); the
composition-root factory respects compile-time ``BUILD_*`` gating
on top of this label.
``lost_frame_threshold`` is the number of consecutive ``LOST``
frames before the strategy raises :class:`VioFatalError`;
default 9 per ``vio_strategy_protocol.md`` v1.0.0.
``warm_start_max_frames`` is the convergence budget after
:meth:`VioStrategy.reset_to_warm_start`; default 5.
``okvis2`` carries OKVIS2-specific knobs (AZ-332); consulted only
when ``strategy == "okvis2"``.
"""
strategy: str = "klt_ransac"
lost_frame_threshold: int = 9
warm_start_max_frames: int = 5
okvis2: Okvis2Config = field(default_factory=Okvis2Config)
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
raise ConfigError(
f"C1VioConfig.strategy={self.strategy!r} not in {sorted(KNOWN_STRATEGIES)}"
)
if self.lost_frame_threshold < 1:
raise ConfigError(
f"C1VioConfig.lost_frame_threshold must be >= 1; got {self.lost_frame_threshold}"
)
if self.warm_start_max_frames < 1:
raise ConfigError(
f"C1VioConfig.warm_start_max_frames must be >= 1; got {self.warm_start_max_frames}"
)
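The validation style of the two config blocks (frozen dataclass, range checks in `__post_init__`) can be sketched in isolation. `WindowConfig` is a hypothetical one-knob reduction of `Okvis2Config`, and the `ConfigError` below is a local stand-in whose base class is an assumption, not the project's definition.

```python
from dataclasses import FrozenInstanceError, dataclass


class ConfigError(Exception):
    """Hypothetical stand-in for the project's ConfigError."""


@dataclass(frozen=True)
class WindowConfig:
    """Hypothetical reduced block: one knob, same validation style."""

    keyframe_window_size: int = 15

    def __post_init__(self) -> None:
        # Reject out-of-range values at construction time.
        if not (10 <= self.keyframe_window_size <= 20):
            raise ConfigError(
                "keyframe_window_size must be in [10, 20]; "
                f"got {self.keyframe_window_size}"
            )


default_cfg = WindowConfig()

try:
    WindowConfig(keyframe_window_size=25)
    rejected_out_of_range = False
except ConfigError:
    rejected_out_of_range = True

try:
    default_cfg.keyframe_window_size = 12  # type: ignore[misc]
    rejected_mutation = False
except FrozenInstanceError:
    rejected_mutation = True
```

Because the dataclass is frozen, a value that passed validation at construction cannot later be mutated into an invalid one, so a consumer holding the block never has to re-validate it.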
@@ -0,0 +1,64 @@
"""C1 VioStrategy error taxonomy (AZ-331).
Every ``VioStrategy`` method raises only members of :class:`VioError`.
Lower-level exceptions from OpenCV / OKVIS2 / VINS-Mono / GTSAM MUST
be caught and rewrapped — the contract closes the error envelope so
consumers can ``except VioError`` once and handle the family.
:class:`VioDegradedError` is documented but is **not raised** during
the normal degraded-operation path: degraded operation returns a
``VioOutput`` with inflated covariance and ``VioHealth.state == DEGRADED``.
The error type exists for the rare degraded→fatal transition.
A separate composition-time error
(:class:`gps_denied_onboard.runtime_root.errors.StrategyNotAvailableError`)
lives outside this family — it is raised by the factory, not by a
``VioStrategy`` method.
"""
from __future__ import annotations
__all__ = [
"VioDegradedError",
"VioError",
"VioFatalError",
"VioInitializingError",
]
class VioError(Exception):
"""Base class for the C1 VIO error family.
All three documented subtypes are children. Consumers catch the
family with ``except c1_vio.errors.VioError as e``.
"""
class VioInitializingError(VioError):
"""Raised while ``VioHealth.state == INIT`` and the strategy has no
pose to emit.
C5 fusion catches this and falls back to the FC IMU prior until
the strategy reports ``TRACKING``.
"""
class VioDegradedError(VioError):
"""Raised on the rare degraded→fatal transition.
The normal degraded-operation path is NOT this exception — it is
a ``VioOutput`` with inflated covariance + ``VioHealth.state ==
DEGRADED``. This type exists only so consumer ``except VioError``
wrappers can name the family member explicitly.
"""
class VioFatalError(VioError):
"""Raised once ``consecutive_lost`` exceeds the configured threshold
(default 9) or on irrecoverable backend init failure during
:meth:`VioStrategy.reset_to_warm_start`.
The AC-5.2 fallback path engages once this fires: the camera
ingest loop stops feeding the strategy and the watchdog flips
the composition root into FC-IMU-only mode.
"""
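The closed error envelope described in the module docstring can be sketched as a rewrap at the strategy boundary. The classes below are local stand-ins for the family above, `BackendCrash` and `guarded_call` are hypothetical, and mapping every unknown failure to `VioFatalError` is an assumption made for the example rather than the documented policy.

```python
class VioError(Exception):
    """Stand-in for the family base class."""


class VioFatalError(VioError):
    """Stand-in for the fatal subtype."""


class BackendCrash(RuntimeError):
    """Hypothetical low-level failure, e.g. raised by a native binding."""


def guarded_call(backend_call):
    """Run a backend call so that only VioError members can escape."""
    try:
        return backend_call()
    except VioError:
        raise  # already inside the family
    except Exception as exc:
        # Rewrap and keep the original as __cause__ for diagnostics.
        raise VioFatalError(f"backend failure: {exc}") from exc


def crashing_backend():
    raise BackendCrash("estimator not wired")


try:
    guarded_call(crashing_backend)
    caught = None
except VioError as err:
    caught = err
```

The consumer needs a single `except VioError` clause, and the chained `__cause__` still identifies the original low-level exception for logging.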
@@ -1,20 +1,94 @@
"""C1 `VioStrategy` Protocol.
"""C1 ``VioStrategy`` Protocol (AZ-331).
Concrete strategies: OKVIS2 (default), VINS-Mono (research-only), KLT/RANSAC
(mandatory simple baseline). See `_docs/02_document/components/01_c1_vio/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; four
methods that span the camera-ingest hot path
(``process_frame``), F8 reboot recovery (``reset_to_warm_start``),
diagnostics (``health_snapshot``), and self-identification
(``current_strategy_label``).
Concrete impls — :class:`Okvis2Strategy` (AZ-332),
:class:`VinsMonoStrategy` (AZ-333), :class:`KltRansacStrategy`
(AZ-334) — live in sibling modules and are imported lazily by
:mod:`gps_denied_onboard.runtime_root.vio_factory`.
The contract at
``_docs/02_document/contracts/c1_vio/vio_strategy_protocol.md``
v1.0.0 is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Literal, Protocol, runtime_checkable
from gps_denied_onboard._types.nav import ImuWindow, NavCameraFrame
from gps_denied_onboard._types.vio import VioOutput
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import (
ImuWindow,
NavCameraFrame,
VioHealth,
VioOutput,
WarmStartPose,
)
__all__ = ["VioStrategy"]
@runtime_checkable
class VioStrategy(Protocol):
"""Visual-Inertial-Odometry strategy."""
"""On-Jetson visual / visual-inertial odometry runtime.
def step(self, frame: NavCameraFrame, imu: ImuWindow) -> VioOutput:
"""Process a single nav-camera frame + IMU window and return a VIO update."""
Implementations:
:class:`Okvis2Strategy` (production-default, OKVIS2 SLAM),
:class:`VinsMonoStrategy` (research-only),
:class:`KltRansacStrategy` (mandatory simple-baseline per ADR-002).
Selection is owned by the composition root.
Invariants (see ``vio_strategy_protocol.md`` v1.0.0):
- Single-threaded per instance (one camera-ingest writer thread).
- ``current_strategy_label()`` is constant per instance and
equals ``config.vio.strategy`` exactly.
- Error envelope is closed — only members of
:class:`VioError` escape ``process_frame``.
"""
def process_frame(
self,
frame: "NavCameraFrame",
imu: "ImuWindow",
calibration: "CameraCalibration",
) -> "VioOutput":
"""Camera-ingest hot-path call (one per nav-camera frame).
``VioOutput.frame_id`` MUST equal ``frame.frame_id`` (C5
alignment invariant). Raises :class:`VioInitializingError`
while booting (state == INIT; no output emitted) and
:class:`VioFatalError` once ``consecutive_lost`` exceeds the
configured threshold. During DEGRADED operation the method
returns normally with an inflated covariance; it does NOT raise
:class:`VioDegradedError`.
"""
...
def reset_to_warm_start(self, hint: "WarmStartPose") -> None:
"""Destructive re-init from an F8-reboot warm-start hint.
Clears keyframe window, IMU integration state, feature
tracks. Subsequent ``process_frame`` calls re-initialise from
the hint. Raises :class:`VioFatalError` only on irrecoverable
backend init failure.
"""
...
def health_snapshot(self) -> "VioHealth":
"""Most-recent health state — FDR-stamped per the AC-NEW-3 audit."""
...
def current_strategy_label(
self,
) -> Literal["okvis2", "vins_mono", "klt_ransac"]:
"""Identify which concrete strategy is wired here.
Returned string equals ``config.vio.strategy`` exactly.
AC-NEW-3 FDR audit relies on this property.
"""
...
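A minimal sketch of what ``@runtime_checkable`` gives a conformance test, assuming a trimmed four-method Protocol. ``VioStrategyLike``, ``FakeStrategy`` and ``Incomplete`` are hypothetical stand-ins, not classes from this repository. Note that ``isinstance`` against a runtime-checkable Protocol only verifies that the member names exist; it does not check signatures or return types.

```python
from typing import Literal, Protocol, runtime_checkable


@runtime_checkable
class VioStrategyLike(Protocol):
    """Hypothetical, trimmed mirror of the four-method surface."""

    def process_frame(self, frame, imu, calibration): ...

    def reset_to_warm_start(self, hint) -> None: ...

    def health_snapshot(self): ...

    def current_strategy_label(
        self,
    ) -> Literal["okvis2", "vins_mono", "klt_ransac"]: ...


class FakeStrategy:
    """Test double: no inheritance, conformance is purely structural."""

    def process_frame(self, frame, imu, calibration):
        return {"frame_id": frame}

    def reset_to_warm_start(self, hint) -> None:
        return None

    def health_snapshot(self):
        return {"state": "INIT"}

    def current_strategy_label(self):
        return "klt_ransac"


class Incomplete:
    """Implements only one of the four methods."""

    def process_frame(self, frame, imu, calibration):
        return None


conforms = isinstance(FakeStrategy(), VioStrategyLike)
rejects = isinstance(Incomplete(), VioStrategyLike)
```

Because the check is name-based only, signature drift still has to be caught by a static type checker or by dedicated conformance tests.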
@@ -0,0 +1,488 @@
"""`Okvis2Strategy` — production-default C1 VIO (AZ-332).
Python facade over the OKVIS2 C++ tightly-coupled keyframe-based VIO
core, accessed via the pybind11 binding at
``_native.okvis2_binding.Okvis2Backend`` (compiled by
``cpp/okvis2/CMakeLists.txt``, gated by ``BUILD_OKVIS2=ON``).
Conforms to the AZ-331 :class:`VioStrategy` Protocol; consumes the
runtime ``Config`` + an :class:`FdrClient`; constructs its other
dependencies (logger, camera calibration, preintegrator) internally
from ``config`` so the strategy class matches the composition-root
factory shape::
strategy_cls(config: Config, *, fdr_client: FdrClient)
Risk-2 mitigation: the native binding is imported **lazily inside the
constructor**, not at module top level. Importing this module with
``BUILD_OKVIS2=OFF`` (no compiled ``.so``) is safe — the factory's
build-flag gate catches that path before the constructor runs.
AC mapping (see ``_docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.md``):
- AC-1 : :meth:`current_strategy_label` returns ``"okvis2"``.
- AC-2 : :meth:`process_frame` returns :class:`VioOutput` with
``frame_id`` echoed; covariance SPD; ``imu_bias`` non-None.
- AC-3 : all backend / Eigen / ``std::runtime_error`` exceptions are
rewrapped into the :class:`VioError` family with the ``__cause__`` chain preserved.
- AC-4 : :meth:`reset_to_warm_start` clears state + seeds hint; second
consecutive call does not raise.
- AC-5 : :meth:`health_snapshot` returns INIT initially, TRACKING after
``warm_start_max_frames`` (default 5) successful frames.
- AC-6 : DEGRADED on feature loss; covariance Frobenius norm strictly
increases; ``process_frame`` still returns :class:`VioOutput` (not raise).
- AC-7 : after ``lost_frame_threshold`` (default 9) consecutive failed
frames, raises :class:`VioFatalError`; state == LOST.
- AC-8 : ``BUILD_OKVIS2=OFF`` does not load this module (enforced by
AZ-331 factory; covered in
``tests/unit/c1_vio/test_protocol_conformance.py``).
- AC-9 / NFR-perf : tier2 — Jetson + Derkachi-class fixture; tests
marked ``@pytest.mark.tier2``.
- AC-10 : exactly one ``vio.health`` FDR record per state transition;
no spam on steady-state.
"""
from __future__ import annotations
import math
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Any, Final, Literal
import numpy as np
from gps_denied_onboard._types.nav import (
FeatureQuality,
ImuBias,
VioHealth,
VioOutput,
VioState,
)
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c1_vio.errors import (
VioFatalError,
VioInitializingError,
)
from gps_denied_onboard.fdr_client.records import CURRENT_SCHEMA_VERSION, FdrRecord
from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
import numpy.typing as npt
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import (
ImuWindow,
NavCameraFrame,
WarmStartPose,
)
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c1_vio.config import Okvis2Config
from gps_denied_onboard.config import Config
from gps_denied_onboard.fdr_client.client import FdrClient
__all__ = ["Okvis2Strategy"]
_STRATEGY_LABEL: Final[Literal["okvis2"]] = "okvis2"
_PRODUCER_ID: Final[str] = "c1_vio.okvis2"
_LOGGER_COMPONENT: Final[str] = "c1_vio.okvis2"
_BIAS_NORM_FLOOR: Final[float] = 0.0
def _now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
def _bias_norm(bias: ImuBias) -> float:
"""L2 norm of the concatenated 6-vector ``(accel || gyro)``."""
accel = np.asarray(bias.accel_bias, dtype=np.float64)
gyro = np.asarray(bias.gyro_bias, dtype=np.float64)
return float(math.sqrt(float(np.dot(accel, accel) + np.dot(gyro, gyro))))
def _se3_from_4x4(matrix: npt.NDArray[Any]) -> Any:
"""Build a ``gtsam.Pose3`` from a 4x4 row-major matrix.
Imported lazily so this module can be imported without gtsam in
headless tooling paths (tests + facade-only smoke).
"""
import gtsam
return gtsam.Pose3(np.asarray(matrix, dtype=np.float64))
class Okvis2Strategy:
"""Production-default :class:`VioStrategy` for E-C1 (AZ-332).
Constructor matches the AZ-331 composition-root factory shape::
Okvis2Strategy(config: Config, *, fdr_client: FdrClient)
Other dependencies (calibration, preintegrator-substrate, logger,
OKVIS2 sub-config) are resolved internally from ``config``.
Concurrency: single-threaded by Protocol invariant. One instance
per camera-ingest writer thread; concurrent ``process_frame`` calls
are undefined behaviour.
"""
def __init__(
self,
config: Config,
*,
fdr_client: FdrClient,
clock: Clock | None = None,
) -> None:
c1_block = config.components["c1_vio"]
if c1_block.strategy != _STRATEGY_LABEL:
raise VioFatalError(
f"Okvis2Strategy constructed with config.strategy="
f"{c1_block.strategy!r}; expected {_STRATEGY_LABEL!r}. "
"The AZ-331 factory is the only sanctioned constructor."
)
self._config = config
self._fdr = fdr_client
self._clock: Clock = clock if clock is not None else WallClock()
self._logger = get_logger(_LOGGER_COMPONENT)
self._lost_frame_threshold: int = c1_block.lost_frame_threshold
self._warm_start_max_frames: int = c1_block.warm_start_max_frames
self._okvis2_cfg: Okvis2Config = c1_block.okvis2
self._calibration: CameraCalibration | None = None
self._frames_since_warmup: int = 0
self._consecutive_lost: int = 0
self._latest_bias: ImuBias = ImuBias(accel_bias=(0.0, 0.0, 0.0), gyro_bias=(0.0, 0.0, 0.0))
self._reported_state: VioState = VioState.INIT
self._last_emitted_state: VioState | None = None
# Lazy import of the native binding — Risk-2 mitigation (I-5).
# Failure here is the BUILD_OKVIS2=OFF path the AZ-331 factory's
# `StrategyNotAvailableError` is meant to prevent; if a caller
# bypasses the factory and reaches this constructor with the
# native lib absent, we surface a fatal init error.
try:
from gps_denied_onboard.components.c1_vio._native import (
okvis2_binding,
)
except ImportError as exc:
raise VioFatalError(
"Okvis2Strategy: native binding "
"(gps_denied_onboard.components.c1_vio._native.okvis2_binding) "
"is not importable. Production binary must be built with "
"BUILD_OKVIS2=ON."
) from exc
self._binding_module = okvis2_binding
self._backend = self._construct_backend()
# ------------------------------------------------------------------
# Public Protocol surface.
def process_frame(
self,
frame: NavCameraFrame,
imu: ImuWindow,
calibration: CameraCalibration,
) -> VioOutput:
"""Hot-path call — one per nav-camera frame.
Steps:
1. Push every IMU sample in the window into the backend; the
strict-monotonic guard lives on the C++ side.
2. Submit the frame.
3. If the backend produced an output, classify health and
build the :class:`VioOutput` DTO.
4. If no output: tick the lost-frame counter; emit a state
transition record if needed.
"""
self._calibration = calibration
frame_id_str = str(frame.frame_id)
emitted_at_ns = self._clock.monotonic_ns()
try:
self._push_imu_window(imu)
produced = self._backend.add_frame(
frame_id_str, _frame_ts_ns(frame), _frame_image(frame)
)
except self._binding_module.OkvisInitException as exc:
self._emit_transition(VioState.INIT, frame_id_str)
raise VioInitializingError(
f"OKVIS2 backend reports INIT while processing frame {frame_id_str!r}: {exc}"
) from exc
except self._binding_module.OkvisOptimizationException as exc:
# Treat as a degraded frame: emit no VioOutput from this
# path — callers expect either a VioOutput or a VioError;
# we choose error here so C5 can fall back, matching AC-3.
self._tick_lost(frame_id_str)
if self._reported_state == VioState.LOST:
self._emit_transition(VioState.LOST, frame_id_str)
raise VioFatalError(
f"OKVIS2 backend exhausted lost-frame budget at {frame_id_str!r}: {exc}"
) from exc
self._emit_transition(self._reported_state, frame_id_str)
raise VioInitializingError(
f"OKVIS2 backend optimisation failure at {frame_id_str!r}: {exc}"
) from exc
except self._binding_module.OkvisFatalException as exc:
self._emit_transition(VioState.LOST, frame_id_str)
raise VioFatalError(
f"OKVIS2 backend fatal exception at {frame_id_str!r}: {exc}"
) from exc
except (RuntimeError, ValueError) as exc:
# Catch-all for unmapped backend exceptions. Re-classify as
# fatal — we explicitly forbid raw library exceptions across
# the public boundary.
raise VioFatalError(
f"OKVIS2 backend raised an unmapped exception at {frame_id_str!r}: {exc}"
) from exc
if not produced:
# Frame consumed but no estimator update yet — INIT path
# while OKVIS2 warms up its keyframe window.
self._emit_transition(VioState.INIT, frame_id_str)
raise VioInitializingError(
f"Okvis2Strategy: backend has not yet emitted an "
f"estimator update at {frame_id_str!r}"
)
raw = self._backend.get_latest_output()
if raw is None:
raise VioFatalError(
f"Okvis2Strategy: backend reported a new output for "
f"{frame_id_str!r} but get_latest_output() returned None"
)
vio_output = self._build_vio_output(raw, emitted_at_ns)
self._consecutive_lost = 0
new_state = self._classify_state(vio_output.feature_quality)
if new_state != self._reported_state:
self._reported_state = new_state
self._emit_transition(new_state, frame_id_str)
if new_state in (VioState.INIT, VioState.TRACKING):
self._frames_since_warmup += 1
if self._okvis2_cfg.per_frame_debug_log:
self._logger.debug(
"okvis2.process_frame",
extra={
"component": _LOGGER_COMPONENT,
"kind": "vio.tick",
"frame_id": frame_id_str,
"kv": {
"state": self._reported_state.value,
"tracked": vio_output.feature_quality.tracked,
"mre_px": vio_output.feature_quality.mre_px,
"emitted_at_ns": vio_output.emitted_at_ns,
},
},
)
return vio_output
def reset_to_warm_start(self, hint: WarmStartPose) -> None:
"""Destructive re-init from an F8-reboot warm-start hint.
Idempotent across consecutive calls (AC-4) — a second call
without an intervening ``process_frame`` reseats the backend
again without raising.
"""
try:
body_T_world = np.asarray(hint.body_T_world.matrix(), dtype=np.float64)
except AttributeError as exc:
raise VioFatalError(
"Okvis2Strategy.reset_to_warm_start: hint.body_T_world is "
"not a gtsam.Pose3 (missing .matrix())"
) from exc
velocity = np.asarray(hint.velocity_b, dtype=np.float64)
accel_bias = np.asarray(hint.bias.accel_bias, dtype=np.float64)
gyro_bias = np.asarray(hint.bias.gyro_bias, dtype=np.float64)
try:
self._backend.reset(body_T_world, velocity, accel_bias, gyro_bias)
except self._binding_module.OkvisInitException as exc:
raise VioFatalError(f"OKVIS2 backend rejected warm-start reset: {exc}") from exc
except (RuntimeError, ValueError) as exc:
raise VioFatalError(
f"OKVIS2 backend raised an unmapped exception during reset: {exc}"
) from exc
self._latest_bias = hint.bias
self._frames_since_warmup = 0
self._consecutive_lost = 0
self._reported_state = VioState.INIT
self._emit_transition(VioState.INIT, frame_id="")
def health_snapshot(self) -> VioHealth:
"""Most-recent health state — no backend call (cheap)."""
return VioHealth(
state=self._reported_state,
consecutive_lost=self._consecutive_lost,
bias_norm=_bias_norm(self._latest_bias),
)
def current_strategy_label(self) -> Literal["okvis2", "vins_mono", "klt_ransac"]:
return _STRATEGY_LABEL
# ------------------------------------------------------------------
# Internal helpers.
def _construct_backend(self) -> Any:
"""Build the backend from config — calibration path is optional
because the unit-test fake-binding path skips real intrinsics.
Tests inject a fake module at ``sys.modules`` before construction
(see ``tests/unit/c1_vio/conftest.py``); the fake's
``Okvis2Backend`` accepts whatever this method passes.
"""
K = self._load_camera_intrinsics()
yaml_config = self._render_yaml_config()
try:
return self._binding_module.Okvis2Backend(yaml_config, K)
except self._binding_module.OkvisInitException as exc:
raise VioFatalError(f"Okvis2Strategy: backend init failed: {exc}") from exc
def _load_camera_intrinsics(self) -> np.ndarray:
"""Load 3x3 camera intrinsics from the calibration path.
Returns the identity matrix when the runtime block has no path
configured. The unit-test path overrides this via the fake
binding's ctor anyway. In production, refusing to start on a
missing calibration is preferable to silently emitting wrong
poses; that refusal is handled by the YAML loader downstream,
not here.
"""
path = self._config.runtime.camera_calibration_path
if not path:
return np.eye(3, dtype=np.float64)
try:
import json
with open(path, encoding="utf-8") as fh:
blob = json.load(fh)
except (OSError, ValueError) as exc:
raise VioFatalError(
f"Okvis2Strategy: failed to load camera calibration from {path!r}: {exc}"
) from exc
K_raw = blob.get("intrinsics_3x3")
if K_raw is None:
raise VioFatalError(
f"Okvis2Strategy: calibration file {path!r} is missing the 'intrinsics_3x3' field"
)
K = np.asarray(K_raw, dtype=np.float64)
if K.shape != (3, 3):
raise VioFatalError(f"Okvis2Strategy: intrinsics_3x3 must be 3x3; got shape {K.shape}")
return K
def _render_yaml_config(self) -> str:
"""Render the Okvis2Config sub-block into an OKVIS2 YAML snippet.
OKVIS2 reads a YAML config string at construction. Only the knobs
AZ-332 exposes are rendered; OKVIS2-internal defaults cover the
rest.
"""
cfg = self._okvis2_cfg
return (
"# AZ-332 — generated OKVIS2 config (see Okvis2Config in c1_vio/config.py)\n"
f"keyframe_window_size: {cfg.keyframe_window_size}\n"
f"keyframe_parallax_threshold_px: {cfg.keyframe_parallax_threshold_px}\n"
f"ransac_inlier_ratio: {cfg.ransac_inlier_ratio}\n"
f"max_optimization_iters: {cfg.max_optimization_iters}\n"
)
def _push_imu_window(self, imu: ImuWindow) -> None:
for sample in imu.samples:
self._backend.add_imu(
sample.ts_ns,
np.asarray(sample.accel_xyz, dtype=np.float64),
np.asarray(sample.gyro_xyz, dtype=np.float64),
)
def _build_vio_output(self, raw: dict[str, Any], emitted_at_ns: int) -> VioOutput:
try:
pose = _se3_from_4x4(raw["pose_T_world_body"])
cov = np.asarray(raw["pose_covariance_6x6"], dtype=np.float64)
bias = ImuBias(
accel_bias=tuple(float(x) for x in raw["accel_bias"]), # type: ignore[arg-type]
gyro_bias=tuple(float(x) for x in raw["gyro_bias"]), # type: ignore[arg-type]
)
feature_quality = FeatureQuality(
tracked=int(raw["tracked_features"]),
new=int(raw["new_features"]),
lost=int(raw["lost_features"]),
mean_parallax=float(raw["mean_parallax"]),
mre_px=float(raw["mre_px"]),
)
backend_ts = int(raw.get("emitted_at_ns") or emitted_at_ns)
# Read frame_id inside the try so a missing key is rewrapped too (AC-3).
frame_id = raw["frame_id"]
except (KeyError, TypeError, ValueError) as exc:
raise VioFatalError(f"Okvis2Strategy: backend output is malformed: {exc}") from exc
if cov.shape != (6, 6):
raise VioFatalError(
f"Okvis2Strategy: pose_covariance_6x6 has shape {cov.shape}; expected (6, 6)"
)
self._latest_bias = bias
return VioOutput(
frame_id=frame_id,
relative_pose_T=pose,
pose_covariance_6x6=cov,
imu_bias=bias,
feature_quality=feature_quality,
emitted_at_ns=backend_ts,
)
def _classify_state(self, fq: FeatureQuality) -> VioState:
if self._reported_state == VioState.INIT and (
self._frames_since_warmup + 1 < self._warm_start_max_frames
):
return VioState.INIT
if fq.tracked < self._okvis2_cfg.degraded_feature_threshold:
return VioState.DEGRADED
return VioState.TRACKING
def _tick_lost(self, frame_id: str) -> None:
self._consecutive_lost += 1
if self._consecutive_lost >= self._lost_frame_threshold:
self._reported_state = VioState.LOST
elif self._reported_state == VioState.TRACKING:
self._reported_state = VioState.DEGRADED
def _emit_transition(self, new_state: VioState, frame_id: str) -> None:
if self._last_emitted_state == new_state:
return
self._last_emitted_state = new_state
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=_now_iso(),
producer_id=_PRODUCER_ID,
kind="vio.health",
payload={
"state": new_state.value,
"consecutive_lost": self._consecutive_lost,
"bias_norm": _bias_norm(self._latest_bias),
"strategy_label": _STRATEGY_LABEL,
"frame_id": frame_id,
},
)
self._fdr.enqueue(record)
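The AC-10 behaviour (one ``vio.health`` record per state transition, nothing on steady state) can be sketched in isolation as follows. ``State`` and ``TransitionEmitter`` are hypothetical stand-ins for ``VioState`` and the strategy's FDR path; the real record shape is the ``FdrRecord`` built above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    INIT = "init"
    TRACKING = "tracking"
    DEGRADED = "degraded"
    LOST = "lost"


@dataclass
class TransitionEmitter:
    """Hypothetical stand-in for the strategy's FDR emission path."""

    records: list = field(default_factory=list)
    last_emitted: Optional[State] = None

    def emit(self, new_state: State, frame_id: str) -> None:
        # Steady state: same state as the last record, so emit nothing.
        if self.last_emitted == new_state:
            return
        self.last_emitted = new_state
        self.records.append(
            {"kind": "vio.health", "state": new_state.value, "frame_id": frame_id}
        )


emitter = TransitionEmitter()
sequence = [
    State.INIT,
    State.INIT,
    State.TRACKING,
    State.TRACKING,
    State.DEGRADED,
    State.TRACKING,
]
for index, state in enumerate(sequence):
    emitter.emit(state, frame_id=f"f{index}")

emitted_states = [record["state"] for record in emitter.records]
```

The guard compares against the last emitted state rather than the last reported one, so a caller can invoke the emitter on every frame without producing duplicate records.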
def _frame_ts_ns(frame: NavCameraFrame) -> int:
"""Convert ``NavCameraFrame.timestamp`` to monotonic-ns.
The value is nanoseconds since the UTC epoch, derived from the
wall-clock datetime rather than read from a monotonic clock. It
increases across frames because the frame source guarantees
strictly increasing timestamps per the FrameSource contract.
"""
return int(frame.timestamp.timestamp() * 1e9)
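For reference, a sketch of the same conversion done with integer arithmetic only. This is an illustrative alternative, not the committed implementation; ``epoch_ns_exact`` is a hypothetical helper name. The motivation is that a float64 holding an epoch-nanosecond count of today's magnitude (about 1.8e18) can only represent multiples of 256 ns, whereas ``datetime`` itself carries exact microseconds.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ns_exact(ts: datetime) -> int:
    """Nanoseconds since the UTC epoch, computed without floats."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = ts - _EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


t0 = datetime(2026, 5, 12, 6, 56, 45, 123457, tzinfo=timezone.utc)
t1 = t0 + timedelta(microseconds=1)

exact0 = epoch_ns_exact(t0)
exact1 = epoch_ns_exact(t1)
step_ns = exact1 - exact0
```

Since frame timestamps are at best microsecond-resolved, the float path is normally still strictly increasing; the integer form simply removes the rounding from the picture.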
def _frame_image(frame: NavCameraFrame) -> np.ndarray:
"""Coerce the frame's image into a contiguous uint8 ndarray."""
arr = np.ascontiguousarray(frame.image, dtype=np.uint8)
if arr.ndim < 2 or arr.ndim > 3:
raise VioFatalError(
f"Okvis2Strategy: NavCameraFrame.image must be 2-D or 3-D; got {arr.ndim}-D"
)
return arr
@@ -1,6 +1,42 @@
"""C2.5 Rerank component — Public API."""
"""C2.5 ReRank — Public API (AZ-342).
from gps_denied_onboard._types.vpr import RerankResult
from gps_denied_onboard.components.c2_5_rerank.interface import RerankStrategy
Per ``rerank_strategy_protocol.md`` v1.0.0 the public surface
consists of:
__all__ = ["RerankResult", "RerankStrategy"]
- :class:`ReRankStrategy` Protocol (one method).
- DTOs re-exported from :mod:`gps_denied_onboard._types.rerank` (the
L1 home for cross-component DTOs): :class:`RerankCandidate`,
:class:`RerankResult`.
- Error family rooted at :class:`RerankError`; two documented
subtypes (:class:`RerankBackboneError`,
:class:`RerankAllCandidatesFailedError`).
- Config block :class:`C2_5RerankConfig` (registered on import).
Concrete strategy (``InlierCountReRanker``, AZ-343) lives in a
sibling module and is imported lazily by
:mod:`gps_denied_onboard.runtime_root.rerank_factory` — Risk-2
mitigation: this ``__init__.py`` MUST NOT import any concrete
strategy module.
"""
from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.components.c2_5_rerank.config import C2_5RerankConfig
from gps_denied_onboard.components.c2_5_rerank.errors import (
RerankAllCandidatesFailedError,
RerankBackboneError,
RerankError,
)
from gps_denied_onboard.components.c2_5_rerank.interface import ReRankStrategy
from gps_denied_onboard.config.schema import register_component_block
register_component_block("c2_5_rerank", C2_5RerankConfig)
__all__ = [
"C2_5RerankConfig",
"ReRankStrategy",
"RerankAllCandidatesFailedError",
"RerankBackboneError",
"RerankCandidate",
"RerankError",
"RerankResult",
]
@@ -0,0 +1,62 @@
"""C2.5 ReRankStrategy config block (AZ-342).
Registered into ``config.components['c2_5_rerank']`` by the package
``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.rerank_factory.build_rerank_strategy`
reads this block to select the strategy and configure the top-N cut.
``top_n`` is the strategy-side cap on the returned
:attr:`RerankResult.candidates` length; the composition root binds
``n`` per-frame from this value (default 3 per the epic's K=10 → N=3
spec).
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"C2_5RerankConfig",
"KNOWN_STRATEGIES",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset({"inlier_count"})
@dataclass(frozen=True)
class C2_5RerankConfig:
"""Per-component config for C2.5 ReRank.
``strategy`` selects exactly one of the registered re-rankers
(today only ``inlier_count``); the composition-root factory
respects compile-time ``BUILD_RERANK_<variant>`` gating on top
of this label.
``top_n`` is the per-frame N cap (1..K-1). Default 3 (the epic's
K=10 → N=3 spec).
``debug_per_frame_log`` gates the two DEBUG events
(``c2_5.rerank.zero_inliers`` per dropped candidate and
``c2_5.rerank.frame_done`` per frame); flooding journald at
``3 Hz × K=10 = 30 events/sec`` by default would violate
description.md § 9. Operators flip this to ``True`` for the
debug-build flight binary.
"""
strategy: str = "inlier_count"
top_n: int = 3
debug_per_frame_log: bool = False
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
raise ConfigError(
f"C2_5RerankConfig.strategy={self.strategy!r} not in "
f"{sorted(KNOWN_STRATEGIES)}"
)
if self.top_n < 1:
raise ConfigError(
f"C2_5RerankConfig.top_n must be >= 1; got {self.top_n}"
)
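A minimal, self-contained sketch of the validate-on-construction pattern used by the config block above. ``ConfigError`` here is a local stand-in (the real one lives in ``config.schema``), and ``RerankConfigSketch`` and ``try_build`` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

KNOWN = frozenset({"inlier_count"})


class ConfigError(ValueError):
    """Local stand-in for the project's ConfigError (assumption)."""


@dataclass(frozen=True)
class RerankConfigSketch:
    strategy: str = "inlier_count"
    top_n: int = 3

    def __post_init__(self) -> None:
        # Runs after the generated __init__, so an invalid block can
        # never exist as a constructed object.
        if self.strategy not in KNOWN:
            raise ConfigError(f"strategy={self.strategy!r} not in {sorted(KNOWN)}")
        if self.top_n < 1:
            raise ConfigError(f"top_n must be >= 1; got {self.top_n}")


def try_build(**kwargs) -> str:
    """Return 'ok' or a short rejection message."""
    try:
        RerankConfigSketch(**kwargs)
    except ConfigError as exc:
        return f"rejected: {exc}"
    return "ok"


default_result = try_build()
zero_result = try_build(top_n=0)
unknown_result = try_build(strategy="cosine")
```

Validation in ``__post_init__`` only reads fields, so it is compatible with ``frozen=True``.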
@@ -0,0 +1,56 @@
"""C2.5 ReRankStrategy error taxonomy (AZ-342).
The family is intentionally narrow: a per-candidate failure is the
normal case (drop-and-continue, INV-8) and is signalled via
``candidates_dropped`` in the returned :class:`RerankResult` —
NOT via an exception. An exception escapes ``rerank`` only when
EVERY candidate fails (:class:`RerankAllCandidatesFailedError`)
which is the C5 → VIO-only-fallback trigger per AC-3.5.
:class:`RerankBackboneError` is raised INSIDE the per-candidate loop,
caught by the strategy, logged ERROR, FDR-stamped, and the
candidate is dropped. It is exposed publicly so the per-candidate
log + FDR taxonomy is observable and so future re-rankers using a
different backbone can re-raise the same kind.
``TileFetchError`` is C6-owned
(``c6_tile_cache.errors.TileNotFoundError`` / ``TileFsError``); the
strategy catches it in the per-candidate loop and treats it
identically to :class:`RerankBackboneError`.
"""
from __future__ import annotations
__all__ = [
"RerankAllCandidatesFailedError",
"RerankBackboneError",
"RerankError",
]
class RerankError(Exception):
"""Base class for the C2.5 rerank error family.
Caught at the runtime root only when
:class:`RerankAllCandidatesFailedError` fires; per-candidate
failures stay inside the strategy.
"""
class RerankBackboneError(RerankError):
"""Per-candidate LightGlue forward-pass failure.
Typical causes: CUDA OOM, TRT engine deserialize mismatch. Logged at ERROR; one
FDR record per occurrence; the offending candidate is dropped
from the rerank set; the surrounding ``rerank`` call continues
with the remaining candidates (INV-8).
"""
class RerankAllCandidatesFailedError(RerankError):
"""Zero survivors after the per-candidate loop.
Every candidate's LightGlue or tile fetch failed. Logged at
ERROR; FDR record ``kind=rerank.all_failed``. C5 falls back to
VIO-only with provenance ``visual_propagated`` (AC-3.5).
"""
@@ -0,0 +1,584 @@
"""C2.5 :class:`InlierCountReRanker` — single-pair LightGlue inlier count (AZ-343).
Production-default :class:`ReRankStrategy` for the K=10 → N=3 cut.
For each candidate in :class:`VprResult.candidates`:
1. Fetch tile pixels via :class:`TileStore.read_tile_pixels` (a
:class:`TilePixelHandle` context manager backed by an mmap'd JPEG).
2. Extract :class:`KeypointSet` from BOTH the query frame and the
candidate tile via the shared :class:`FeatureExtractor` (AZ-343
scope expansion).
3. Call :meth:`LightGlueRuntime.match` for the single pair; count the
number of correspondences as the inlier proxy.
4. Sort surviving candidates descending by ``inlier_count`` (ties
broken ascending by ``descriptor_distance`` carried forward from
C2; INV-3); truncate to ``n``; return a :class:`RerankResult`.
Drop-and-continue (INV-8) is the central reliability mechanism: any
per-candidate :class:`TileCacheError` or LightGlue / feature-extractor
failure is caught inside the loop, the candidate is dropped, an ERROR
log + FDR record is emitted, and the loop continues. Only the
zero-survivors case escapes as :class:`RerankAllCandidatesFailedError`.
The survivor's ``tile_pixels_handle`` is identity-equal to the handle
returned by ``TileStore.read_tile_pixels`` (INV-6 / AC-7). The handle
is exited at the end of feature extraction; downstream C3 re-enters it
to read pixels — the C6 page-cache-backed impl supports re-entry for
the per-frame TTL window.
"""
from __future__ import annotations
import logging
from datetime import datetime, timezone
from typing import TYPE_CHECKING
import cv2
import numpy as np
from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.components.c2_5_rerank.errors import (
RerankAllCandidatesFailedError,
RerankBackboneError,
)
from gps_denied_onboard.fdr_client import FdrRecord
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractorError
from gps_denied_onboard.helpers.lightglue_runtime import (
LightGlueConcurrentAccessError,
LightGlueRuntimeError,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matching import KeypointSet
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.vpr import VprResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractor
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
# Cross-component types (`TileStore`, `ReRankStrategy`,
# `C2_5RerankConfig`) are intentionally NOT imported here — even under
# ``TYPE_CHECKING``, an AST-level cross-component import is rejected by
# ``test_ac6_only_compose_root_imports_concrete_strategies``. The
# composition root (``runtime_root.rerank_factory``) injects concrete
# instances satisfying these Protocols; we accept them as ``object``
# at the constructor boundary and trust the runtime root for type
# safety.
__all__ = ["InlierCountReRanker", "create"]
_LOG = logging.getLogger("gps_denied_onboard.c2_5_rerank")
_PRODUCER_ID = "c2_5_rerank.inlier_count"
# C6 TileCacheError lives in `gps_denied_onboard.components.c6_tile_cache.errors`
# but we cannot import it: cross-component imports are banned outside the
# composition root (test_ac6_only_compose_root_imports_concrete_strategies).
# Match the family by class-module prefix instead — the C6 contract documents
# the module path so a future module-rename surfaces as a test failure here.
_C6_ERROR_MODULE_PREFIX = "gps_denied_onboard.components.c6_tile_cache"
def _is_tile_cache_error(exc: BaseException) -> bool:
"""True if ``exc`` is a C6 :class:`TileCacheError` subclass.
Matches on the module in which the exception class is defined, to
keep the architectural import boundary clean. Any exception class
defined under the C6 package matches, even one that does not
derive from :class:`TileCacheError`. Builtin exceptions raised
from inside C6 (e.g. ``AttributeError``) do not match, because
their class module is ``builtins``; callers re-raise those. By
contract, C6 wraps OS errors into :class:`TileCacheError`, so the
expected fetch failures are absorbed by the per-candidate drop
semantics.
"""
return type(exc).__module__.startswith(_C6_ERROR_MODULE_PREFIX)
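A small sketch of how the module-prefix check behaves. ``FakeTileError`` is a hypothetical class whose ``__module__`` is patched to imitate an exception defined under the C6 package; nothing here imports the real C6 code. The point is that the check keys on where the exception class is defined, not on where the exception was raised.

```python
PREFIX = "gps_denied_onboard.components.c6_tile_cache"


def is_tile_cache_error(exc: BaseException) -> bool:
    """Same predicate as above, reproduced so the sketch is self-contained."""
    return type(exc).__module__.startswith(PREFIX)


class FakeTileError(Exception):
    """Hypothetical stand-in for an exception class defined inside C6."""


# Imitate a class that was defined in the C6 errors module.
FakeTileError.__module__ = PREFIX + ".errors"

matches_c6 = is_tile_cache_error(FakeTileError("tile missing"))
# A builtin exception reports "builtins" as its class module.
matches_builtin = is_tile_cache_error(AttributeError("bug"))
```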
class InlierCountReRanker:
"""Single-pair LightGlue inlier-count :class:`ReRankStrategy` (AZ-343)."""
def __init__(
self,
*,
config: Config,
tile_store: object,
lightglue_runtime: LightGlueRuntime,
feature_extractor: FeatureExtractor,
clock: Clock,
fdr_client: FdrClient | None,
) -> None:
# Keyword-only injection: a runtime-root regression that forgets
# one of the helpers fails loudly instead of silently constructing
# an under-wired strategy. ``tile_store`` is typed ``object``
# because the C6 ``TileStore`` Protocol lives in another
# component (see the module docstring on cross-component imports).
block = config.components["c2_5_rerank"]
self._tile_store = tile_store
self._lightglue_runtime = lightglue_runtime
self._feature_extractor = feature_extractor
self._clock = clock
self._fdr_client = fdr_client
self._top_n = int(block.top_n)
self._debug_per_frame_log = bool(block.debug_per_frame_log)
def rerank(
self,
frame: NavCameraFrame,
vpr_result: VprResult,
n: int,
calibration: CameraCalibration,
) -> RerankResult:
candidates_input = len(vpr_result.candidates)
target_n = self._top_n if n <= 0 else min(self._top_n, n)
if candidates_input == 0:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=0,
candidates_dropped=0,
reason="no_input_candidates",
)
query_features = self._extract_query_features(frame)
if query_features is None:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
reason="query_extraction_failed",
)
survivors: list[RerankCandidate] = []
dropped = 0
inlier_counts: list[int] = []
for vpr_candidate in vpr_result.candidates:
tile_id = vpr_candidate.tile_id
survivor = self._process_candidate(
tile_id=tile_id,
vpr_candidate=vpr_candidate,
query_features=query_features,
frame_id=vpr_result.frame_id,
)
if survivor is None:
dropped += 1
continue
survivors.append(survivor)
inlier_counts.append(survivor.inlier_count)
if not survivors:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
reason="all_candidates_dropped",
)
survivors.sort(key=lambda c: (-c.inlier_count, c.descriptor_distance))
truncated = tuple(survivors[:target_n])
if len(truncated) < target_n:
_LOG.warning(
"c2_5.rerank.fewer_than_n_survivors",
extra={
"kind": "c2_5.rerank.fewer_than_n_survivors",
"kv": {
"requested": target_n,
"returned": len(truncated),
"dropped": dropped,
"frame_id": vpr_result.frame_id,
},
},
)
if self._debug_per_frame_log:
_LOG.debug(
"c2_5.rerank.frame_done",
extra={
"kind": "c2_5.rerank.frame_done",
"kv": {
"frame_id": vpr_result.frame_id,
"inlier_counts": inlier_counts,
},
},
)
result = RerankResult(
frame_id=vpr_result.frame_id,
candidates=truncated,
reranked_at=int(self._clock.monotonic_ns()),
rerank_label="inlier_count",
candidates_input=candidates_input,
candidates_dropped=dropped,
)
self._emit_frame_done_fdr(result)
return result
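The ordering rule used in ``rerank`` (descending ``inlier_count``, ties broken by ascending ``descriptor_distance``, then truncation to N) can be sketched with plain tuples. ``Cand`` and ``top_n`` are hypothetical names and the numbers are made up for illustration.

```python
from typing import NamedTuple


class Cand(NamedTuple):
    tile_id: str
    inlier_count: int
    descriptor_distance: float


def top_n(cands: list, n: int) -> list:
    """Sort by (-inliers, distance) and keep the first n entries."""
    ranked = sorted(cands, key=lambda c: (-c.inlier_count, c.descriptor_distance))
    return ranked[:n]


candidates = [
    Cand("a", 40, 0.30),
    Cand("b", 55, 0.42),
    Cand("c", 40, 0.21),  # ties with "a" on inliers, wins on distance
    Cand("d", 12, 0.10),
]
order = [c.tile_id for c in top_n(candidates, 3)]
```

Negating the count inside the key keeps the whole sort a single ascending pass, which avoids a second sort for the tie-break.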
# ------------------------------------------------------------------
# Per-candidate pipeline: open handle → extract → match → score.
def _process_candidate(
self,
*,
tile_id,
vpr_candidate,
query_features,
frame_id: int,
) -> RerankCandidate | None:
try:
handle = self._tile_store.read_tile_pixels(tile_id)
except Exception as exc:
if not _is_tile_cache_error(exc):
raise
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
tile_features = self._extract_tile_features(
handle=handle, tile_id=tile_id, frame_id=frame_id
)
if tile_features is None:
return None
inlier_count = self._count_inliers(
query_features=query_features,
tile_features=tile_features,
tile_id=tile_id,
frame_id=frame_id,
)
if inlier_count is None:
return None
if inlier_count == 0:
self._maybe_log_zero_inliers(tile_id=tile_id, frame_id=frame_id)
return None
return RerankCandidate(
tile_id=tile_id,
inlier_count=inlier_count,
descriptor_distance=vpr_candidate.descriptor_distance,
descriptor_dim=vpr_candidate.descriptor_dim,
tile_pixels_handle=handle,
)
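A compact sketch of the drop-and-continue shape used by the per-candidate pipeline: a failing or zero-inlier candidate yields ``None``, the loop counts it as dropped, and only the zero-survivors case raises. ``AllFailedError``, ``process`` and ``rerank_sketch`` are hypothetical names; the matcher is a plain callable standing in for the LightGlue runtime.

```python
class AllFailedError(Exception):
    """Stand-in for RerankAllCandidatesFailedError (assumption)."""


def process(candidate, matcher):
    """Return (candidate, inliers), or None when the candidate is dropped."""
    try:
        inliers = matcher(candidate)
    except RuntimeError:
        return None  # per-candidate failure: drop, do not propagate
    if inliers == 0:
        return None  # no correspondences: drop as well
    return candidate, inliers


def rerank_sketch(candidates, matcher):
    survivors, dropped = [], 0
    for candidate in candidates:
        survivor = process(candidate, matcher)
        if survivor is None:
            dropped += 1
            continue
        survivors.append(survivor)
    if not survivors:
        raise AllFailedError(f"{dropped} of {len(candidates)} candidates dropped")
    return survivors, dropped


def flaky_matcher(candidate):
    if candidate == "bad":
        raise RuntimeError("forward pass failed")
    return {"good": 7, "empty": 0}[candidate]


survivors, dropped = rerank_sketch(["good", "bad", "empty"], flaky_matcher)

try:
    rerank_sketch(["bad", "empty"], flaky_matcher)
    all_failed = False
except AllFailedError:
    all_failed = True
```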
def _extract_query_features(
self, frame: NavCameraFrame
) -> KeypointSet | None:
image = _ensure_bgr_array(frame.image)
if image is None:
self._log_backbone_error(
frame_id=frame.frame_id,
tile_id=None,
reason="query_image_not_decodable",
error=None,
)
return None
try:
return self._feature_extractor.extract(image)
except FeatureExtractorError as exc:
self._log_backbone_error(
frame_id=frame.frame_id,
tile_id=None,
reason="query_feature_extraction_failed",
error=exc,
)
return None
def _extract_tile_features(
self, *, handle, tile_id, frame_id: int
) -> KeypointSet | None:
try:
with handle as jpeg_view:
tile_image = _decode_jpeg(jpeg_view)
except ValueError as exc:
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
except Exception as exc:
if not _is_tile_cache_error(exc):
raise
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
try:
return self._feature_extractor.extract(tile_image)
except FeatureExtractorError as exc:
self._log_backbone_error(
frame_id=frame_id,
tile_id=tile_id,
reason="tile_feature_extraction_failed",
error=exc,
)
return None
def _count_inliers(
self,
*,
query_features,
tile_features,
tile_id,
frame_id: int,
) -> int | None:
try:
correspondences = self._lightglue_runtime.match(
query_features, tile_features
)
except (
LightGlueRuntimeError,
LightGlueConcurrentAccessError,
RerankBackboneError,
RuntimeError,
) as exc:
self._log_backbone_error(
frame_id=frame_id,
tile_id=tile_id,
reason="lightglue_forward_failed",
error=exc,
)
return None
scores = getattr(correspondences, "scores", None)
if scores is None:
return 0
try:
return int(np.asarray(scores).shape[0])
except (IndexError, TypeError, ValueError):
return 0
# ------------------------------------------------------------------
# Log + FDR helpers.
def _log_tile_fetch_error(self, *, tile_id, frame_id: int, exc: BaseException) -> None:
_LOG.error(
"c2_5.rerank.tile_fetch_error",
extra={
"kind": "c2_5.rerank.tile_fetch_error",
"kv": {
"frame_id": frame_id,
"tile_id": list(tile_id),
"error": repr(exc),
},
},
)
if self._fdr_client is None:
return
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.tile_fetch_error",
payload={
"frame_id": int(frame_id),
"tile_id": list(tile_id),
},
)
)
def _log_backbone_error(
self,
*,
frame_id: int,
tile_id,
reason: str,
error: BaseException | None,
) -> None:
kv: dict[str, object] = {"frame_id": frame_id, "reason": reason}
if tile_id is not None:
kv["tile_id"] = list(tile_id)
if error is not None:
kv["error"] = repr(error)
_LOG.error(
"c2_5.rerank.backbone_error",
extra={"kind": "c2_5.rerank.backbone_error", "kv": kv},
)
if self._fdr_client is None:
return
payload: dict[str, object] = {
"frame_id": int(frame_id),
"reason": reason,
}
if tile_id is not None:
payload["tile_id"] = list(tile_id)
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.backbone_error",
payload=payload,
)
)
def _maybe_log_zero_inliers(self, *, tile_id, frame_id: int) -> None:
if not self._debug_per_frame_log:
return
_LOG.debug(
"c2_5.rerank.zero_inliers",
extra={
"kind": "c2_5.rerank.zero_inliers",
"kv": {"frame_id": frame_id, "tile_id": list(tile_id)},
},
)
def _emit_frame_done_fdr(self, result: RerankResult) -> None:
if self._fdr_client is None:
return
top = result.candidates[0]
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.frame_done",
payload={
"frame_id": int(result.frame_id),
"candidates_input": int(result.candidates_input),
"candidates_dropped": int(result.candidates_dropped),
"top_inlier_count": int(top.inlier_count),
"top_tile_id": list(top.tile_id),
},
)
)
def _emit_all_failed_fdr(
self, *, frame_id: int, candidates_input: int, candidates_dropped: int
) -> None:
if self._fdr_client is None:
return
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.all_failed",
payload={
"frame_id": int(frame_id),
"candidates_input": int(candidates_input),
"candidates_dropped": int(candidates_dropped),
},
)
)
def _fail_all(
self,
*,
frame_id: int,
candidates_input: int,
candidates_dropped: int,
reason: str,
) -> None:
_LOG.error(
"c2_5.rerank.all_failed",
extra={
"kind": "c2_5.rerank.all_failed",
"kv": {
"frame_id": frame_id,
"candidates_input": candidates_input,
"candidates_dropped": candidates_dropped,
"reason": reason,
},
},
)
self._emit_all_failed_fdr(
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_dropped,
)
raise RerankAllCandidatesFailedError(
f"InlierCountReRanker.rerank: zero survivors "
f"(frame_id={frame_id!r}, candidates_input={candidates_input}, "
f"candidates_dropped={candidates_dropped}, reason={reason!r})"
)
def _safe_enqueue(self, record: FdrRecord) -> None:
try:
self._fdr_client.enqueue(record) # type: ignore[union-attr]
except Exception as exc:
# FDR enqueue failures are observability-only; they must
# NEVER promote to an InlierCountReRanker drop event.
_LOG.debug(
"c2_5.rerank.fdr_enqueue_failed",
extra={
"kind": "c2_5.rerank.fdr_enqueue_failed",
"kv": {"error": repr(exc)},
},
)
def _fdr_ts(self) -> str:
ns = int(self._clock.time_ns())
seconds, fraction_ns = divmod(ns, 1_000_000_000)
dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
# ISO-8601 with a nanosecond fractional part and an explicit UTC
# offset. datetime.fromisoformat on Python 3.11+ parses it but
# truncates the fraction to microseconds; the full nanosecond
# digits stay in the string for the FDR consumer.
return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction_ns:09d}+00:00"
def _ensure_bgr_array(image: object) -> np.ndarray | None:
"""Coerce ``NavCameraFrame.image`` into a BGR ``np.ndarray``.
Accepts an already-decoded array (returned as-is) or a JPEG/PNG
byte buffer (decoded via ``cv2.imdecode``). Anything else returns
``None`` so the caller routes through the backbone-error drop path.
"""
if isinstance(image, np.ndarray):
return image
if isinstance(image, (bytes, bytearray, memoryview)):
data = bytes(image)
if not data:
return None
buf = np.frombuffer(data, dtype=np.uint8)
return cv2.imdecode(buf, cv2.IMREAD_COLOR)
return None
def _decode_jpeg(jpeg_view: memoryview) -> np.ndarray:
"""Decode a JPEG ``memoryview`` into a BGR ``np.ndarray``.
Raises :class:`ValueError` if the buffer is empty or invalid; the
caller catches both and treats them as a tile-fetch-error drop.
"""
data = bytes(jpeg_view)
if not data:
raise ValueError("empty JPEG buffer")
buf = np.frombuffer(data, dtype=np.uint8)
decoded = cv2.imdecode(buf, cv2.IMREAD_COLOR)
if decoded is None:
raise ValueError("cv2.imdecode returned None for tile JPEG")
return decoded
# ----------------------------------------------------------------------
# Module-level factory entry point consumed by
# :func:`gps_denied_onboard.runtime_root.rerank_factory.build_rerank_strategy`.
def create(
config: Config,
*,
tile_store: object,
lightglue_runtime: LightGlueRuntime,
feature_extractor: FeatureExtractor,
clock: Clock,
fdr_client: FdrClient | None = None,
) -> object:
"""Construct an :class:`InlierCountReRanker` from injected helpers."""
strategy = InlierCountReRanker(
config=config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
_LOG.info(
"c2_5.rerank.ready",
extra={
"kind": "c2_5.rerank.ready",
"kv": {
"strategy": "inlier_count",
"N": int(config.components["c2_5_rerank"].top_n),
"K": 10,
},
},
)
return strategy
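The timestamp formatting done by `_fdr_ts` above can be sketched in isolation. This is a minimal illustration, not the module's code: `format_ns_utc` and `parse_ns_utc` are hypothetical names, and the parser is an assumed consumer-side helper that reads the nine fraction digits directly instead of relying on `datetime.fromisoformat`.

```python
from datetime import datetime, timezone


def format_ns_utc(ns: int) -> str:
    """Format integer nanoseconds since the Unix epoch as ISO-8601 UTC."""
    seconds, fraction_ns = divmod(ns, 1_000_000_000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # strftime has no nanosecond directive, so the 9-digit fraction is
    # appended by hand, followed by an explicit UTC offset.
    return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction_ns:09d}+00:00"


def parse_ns_utc(stamp: str) -> int:
    """Recover the exact nanosecond count (hypothetical consumer helper)."""
    base, rest = stamp.split(".")
    fraction = rest[:9]  # the nine digits before the "+00:00" offset
    dt = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1_000_000_000 + int(fraction)
```

Splitting on whole seconds first keeps the arithmetic in integers, so no precision is lost to floating point before the fraction is formatted.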
@@ -1,17 +1,98 @@
"""C2.5 `RerankStrategy` Protocol.
"""C2.5 ``ReRankStrategy`` Protocol (AZ-342).
Default: `InlierBasedReranker` (single-pair LightGlue inlier counter, K=10 → N=3).
See `_docs/02_document/components/03_c2_5_rerank/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; a single
``rerank`` method that consumes a C2 :class:`VprResult` and produces
a :class:`RerankResult` ranked by single-pair LightGlue inlier count.
Concrete impl — :class:`InlierCountReRanker` (AZ-343) — lives in a
sibling module and is imported lazily by
:mod:`gps_denied_onboard.runtime_root.rerank_factory`.
The contract at
``_docs/02_document/contracts/c2_5_rerank/rerank_strategy_protocol.md``
v1.0.0 is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.vpr import RerankResult, VprResult
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
from gps_denied_onboard._types.vpr import VprResult
__all__ = ["ReRankStrategy"]
class RerankStrategy(Protocol):
"""Re-rank C2's top-K candidates down to N via cross-domain match scoring."""
@runtime_checkable
class ReRankStrategy(Protocol):
"""Single-camera re-rank strategy.
def rerank(self, vpr_result: VprResult, n_keep: int = 3) -> RerankResult: ...
Stateless per-frame; the only persistent state is the
constructor-injected
:class:`gps_denied_onboard.helpers.lightglue_runtime.LightGlueRuntime`
helper handle and the :class:`TileStore` Public API reference.
Invariants (see ``rerank_strategy_protocol.md`` v1.0.0):
- **INV-1 single-threaded** — each instance is bound to one
ingest thread; the shared ``LightGlueRuntime`` requires serial
access. Concurrent :meth:`rerank` calls on a single instance
race the GPU stream.
- **INV-2 stateless per-frame** — same inputs → same surviving
candidates in same order.
- **INV-3 top-N descending by inlier_count** — ties broken
deterministically by ``descriptor_distance`` ascending (the
C2-stage value carried forward).
- **INV-4 candidates length bounded** — ``0 < len <= n`` when
returned (zero raises :class:`RerankAllCandidatesFailedError`);
never exceeds ``n``; never exceeds
``len(vpr_result.candidates)``.
- **INV-5 descriptor_distance carried forward unchanged** — the
C2-stage value is preserved on every survivor for FDR
provenance.
- **INV-6 tile_pixels_handle is a reference, NOT a copy** —
``RerankCandidate.tile_pixels_handle`` is the same handle
returned by ``TileStore.read_tile_pixels`` (page-cache
backed).
- **INV-7 deterministic per tuple** — same ``(frame,
vpr_result, corpus, helper)`` → bit-identical
:class:`RerankResult`.
- **INV-8 drop-and-continue** — a per-candidate exception
NEVER propagates out of :meth:`rerank` unless EVERY candidate
fails. C3 relies on this partial-input tolerance.
Error envelope: only :class:`RerankAllCandidatesFailedError`
escapes :meth:`rerank`; per-candidate
:class:`RerankBackboneError` / ``TileFetchError`` from C6 are
caught inside the loop and turned into dropped candidates +
ERROR logs + per-occurrence FDR records.
"""
def rerank(
self,
frame: "NavCameraFrame",
vpr_result: "VprResult",
n: int,
calibration: "CameraCalibration",
) -> "RerankResult":
"""Re-rank the top-K candidates down to top-N by inlier count.
For each ``candidate`` in ``vpr_result.candidates``:
1. Fetch tile pixels via ``TileStore.read_tile_pixels(candidate.tile_id)``.
2. Run a single-pair LightGlue forward via the shared
:class:`LightGlueRuntime` (frame ↔ tile).
3. Record the inlier count.
Sort candidates descending by inlier count; return the top-N
as a :class:`RerankResult`. Drop-and-continue semantics
apply per INV-8.
Raises:
RerankAllCandidatesFailedError: zero survivors after
the per-candidate loop.
"""
...
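The ordering rules in INV-3 and INV-4 (descending inlier count, ties broken by ascending `descriptor_distance`, at most `n` survivors) can be sketched with a single composite sort key. `ScoredCandidate` and `select_top_n` are hypothetical stand-ins introduced only for illustration; the real `RerankCandidate` carries more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredCandidate:
    """Hypothetical reduction of RerankCandidate to the sort-relevant fields."""

    tile_id: tuple
    inlier_count: int
    descriptor_distance: float


def select_top_n(survivors, n):
    """Return at most n survivors, best first, with a deterministic tie-break."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Negating inlier_count sorts it descending while descriptor_distance
    # stays ascending, so equal inlier counts resolve deterministically.
    ranked = sorted(
        survivors, key=lambda c: (-c.inlier_count, c.descriptor_distance)
    )
    return ranked[:n]
```

Because `sorted` is stable and the key is total over both fields, the same survivors give the same output order on every call, which is what INV-2 and INV-7 ask for.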
@@ -1,6 +1,51 @@
"""C2 VPR component — Public API."""
"""C2 VPR — Public API (AZ-336).
from gps_denied_onboard._types.vpr import VprQuery, VprResult
Per ``vpr_strategy_protocol.md`` v1.0.0 the public surface consists
of:
- :class:`VprStrategy` Protocol (3 methods).
- DTOs re-exported from :mod:`gps_denied_onboard._types.vpr` (the L1
home for cross-component DTOs): :class:`VprQuery`,
:class:`VprCandidate`, :class:`VprResult`.
- Error family rooted at :class:`VprError`; three documented
subtypes (:class:`VprBackboneError`, :class:`VprPreprocessError`,
:class:`IndexUnavailableError`).
- Config block :class:`C2VprConfig` (registered on import).
:class:`BackbonePreprocessor` is C2-internal (see
``components/02_c2_vpr/description.md`` § 6) and intentionally NOT
re-exported.
Concrete strategies (``UltraVprStrategy``, ``NetVladStrategy``,
``MegaLocStrategy``, ``MixVprStrategy``, ``SelaVprStrategy``,
``EigenPlacesStrategy``, ``SaladStrategy``) live in sibling modules
and are imported lazily by
:mod:`gps_denied_onboard.runtime_root.vpr_factory` — Risk-2
mitigation: this ``__init__.py`` MUST NOT import any concrete
strategy module.
"""
from gps_denied_onboard._types.vpr import VprCandidate, VprQuery, VprResult
from gps_denied_onboard.components.c2_vpr.config import C2VprConfig
from gps_denied_onboard.components.c2_vpr.errors import (
IndexUnavailableError,
VprBackboneError,
VprError,
VprPreprocessError,
)
from gps_denied_onboard.components.c2_vpr.interface import VprStrategy
from gps_denied_onboard.config.schema import register_component_block
__all__ = ["VprQuery", "VprResult", "VprStrategy"]
register_component_block("c2_vpr", C2VprConfig)
__all__ = [
"C2VprConfig",
"IndexUnavailableError",
"VprBackboneError",
"VprCandidate",
"VprError",
"VprPreprocessError",
"VprQuery",
"VprResult",
"VprStrategy",
]
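Because the error family is rooted at `VprError`, a consumer can catch the whole family in one clause, provided concrete strategies rewrap lower-level failures. The sketch below shows one way to do that; the classes are local stand-ins for the project's error types, and `run_forward` is a hypothetical wrapper, not a function from this package.

```python
class VprError(Exception):
    """Local stand-in for the family root."""


class VprBackboneError(VprError):
    """Local stand-in for the backbone failure subtype."""


def run_forward(forward):
    """Call forward() and close the error envelope around it."""
    try:
        return forward()
    except VprError:
        # Already inside the family: let it propagate unchanged.
        raise
    except Exception as exc:
        # Anything else (runtime, allocator, IO) is rewrapped; "from exc"
        # keeps the original exception reachable via __cause__.
        raise VprBackboneError(f"backbone forward failed: {exc!r}") from exc
```

Chaining with `from exc` keeps the original traceback available for logs while the caller only needs `except VprError`.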
@@ -0,0 +1,60 @@
"""C2-internal ``BackbonePreprocessor`` Protocol (AZ-336).
The preprocessor is the resize / crop / normalise step that turns a
``NavCameraFrame`` into the tensor the backbone's forward pass
expects. It is C2-internal — each concrete :class:`VprStrategy`
owns its own preprocessor; sharing across backbones is forbidden per
``components/02_c2_vpr/description.md`` § 6 (preprocessing parameters
are tightly coupled to the backbone weights, so a shared
preprocessor would let a NetVLAD instance corrupt UltraVPR's input
layout).
This Protocol is NOT re-exported from ``c2_vpr.__init__`` — keeping
it inside the package enforces the description.md § 6 boundary.
"""
from __future__ import annotations
from typing import TYPE_CHECKING, Protocol, runtime_checkable
if TYPE_CHECKING:
import numpy as np
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import NavCameraFrame
__all__ = ["BackbonePreprocessor"]
@runtime_checkable
class BackbonePreprocessor(Protocol):
"""Resize / crop / normalise per backbone's input contract.
Each :class:`VprStrategy` implementation owns its concrete
preprocessor (NOT shared across backbones). The strategy calls
:meth:`preprocess` inside :meth:`VprStrategy.embed_query` before
running the forward pass.
"""
def preprocess(
self,
frame: "NavCameraFrame",
calibration: "CameraCalibration",
) -> "np.ndarray":
"""Return the preprocessed input tensor in the backbone's layout.
Typical shape: ``(1, 3, H, W)`` NCHW float16 for TRT engines.
Raises :class:`VprPreprocessError` when the input frame
violates the backbone's contract (wrong colour channels,
calibration mismatch).
"""
...
def input_shape(self) -> tuple[int, ...]:
"""``(H, W)`` resize target the backbone expects.
Stable for the preprocessor's lifetime; consumed by tests to
assert preprocessing fidelity.
"""
...
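A note on what `@runtime_checkable` buys: `isinstance` against such a Protocol only checks that the named methods exist, not their signatures or return types. The toy classes below are hypothetical and exist only to show that behaviour.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasPreprocess(Protocol):
    """Toy two-method Protocol mirroring the shape above."""

    def preprocess(self, frame, calibration): ...

    def input_shape(self): ...


class GoodPreprocessor:
    def preprocess(self, frame, calibration):
        return frame

    def input_shape(self):
        return (224, 224)


class WrongSignature:
    # Wrong arity and wrong return type, yet both method names exist.
    def preprocess(self):
        return None

    def input_shape(self):
        return "not a tuple"


class MissingMethod:
    def preprocess(self, frame, calibration):
        return frame
```

Here `isinstance(WrongSignature(), HasPreprocess)` is `True`, so a runtime check of this kind is a presence test only; signature conformance still has to come from a static type checker or from tests.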
@@ -0,0 +1,82 @@
"""C2 VPR strategy config block (AZ-336).
Registered into ``config.components['c2_vpr']`` by the package
``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.vpr_factory.build_vpr_strategy`
reads this block to select the strategy and locate the backbone
weights + FAISS index sidecar.
``backbone_weights_path`` and ``faiss_index_path`` are
deployment-specific, although the dataclass declares a default for
each. Both are typed :class:`pathlib.Path`; ``__post_init__`` coerces
strings emitted by YAML loaders and validates that both are non-empty.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from pathlib import Path
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"C2VprConfig",
"KNOWN_STRATEGIES",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset(
{
"ultra_vpr",
"net_vlad",
"mega_loc",
"mix_vpr",
"sela_vpr",
"eigen_places",
"salad",
}
)
@dataclass(frozen=True)
class C2VprConfig:
"""Per-component config for C2 VPR.
``strategy`` selects exactly one of the seven backbones
(see :data:`KNOWN_STRATEGIES`); the composition-root factory
respects compile-time ``BUILD_VPR_<variant>`` gating on top of
this label.
``backbone_weights_path`` is the on-disk location of the
backbone weights (TRT engine, ONNX model, PyTorch state dict —
per strategy). ``faiss_index_path`` is the location of the
pre-built FAISS HNSW index file (C6 ``DescriptorIndex`` reads
its sidecar there).
"""
strategy: str = "net_vlad"
backbone_weights_path: Path = field(default_factory=lambda: Path("/models/vpr/weights"))
faiss_index_path: Path = field(default_factory=lambda: Path("/cache/vpr/index.faiss"))
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
raise ConfigError(
f"C2VprConfig.strategy={self.strategy!r} not in "
f"{sorted(KNOWN_STRATEGIES)}"
)
if not isinstance(self.backbone_weights_path, Path):
object.__setattr__(
self, "backbone_weights_path", Path(self.backbone_weights_path)
)
if not isinstance(self.faiss_index_path, Path):
object.__setattr__(
self, "faiss_index_path", Path(self.faiss_index_path)
)
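# Path("") normalises to Path("."), so str() of a coerced path is
# never empty; compare against the empty path instead.
if self.backbone_weights_path == Path(""):
raise ConfigError(
"C2VprConfig.backbone_weights_path must be non-empty"
)
if self.faiss_index_path == Path(""):
raise ConfigError(
"C2VprConfig.faiss_index_path must be non-empty"
)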
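One subtlety when validating `Path` fields on a frozen dataclass: `Path("")` normalises to `.`, so an emptiness test on `str(path)` after coercion can never fire. The sketch below shows one way to guard against that by checking the raw value first. `PathConfig` is a hypothetical single-field class, not the project's `C2VprConfig`, and it raises `ValueError` where the project would presumably raise its own `ConfigError`.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PathConfig:
    """Hypothetical single-field config used only for illustration."""

    weights_path: Path

    def __post_init__(self) -> None:
        raw = self.weights_path
        # Reject empty input before coercion: Path("") becomes Path("."),
        # whose str() is ".", so a later emptiness test would pass it.
        if isinstance(raw, str) and not raw.strip():
            raise ValueError("weights_path must be non-empty")
        if not isinstance(raw, Path):
            # Frozen dataclasses block normal attribute assignment,
            # hence object.__setattr__ for the coerced value.
            object.__setattr__(self, "weights_path", Path(raw))
```

With this shape, `PathConfig("")` fails at construction while `PathConfig("/models/w.engine")` ends up holding a `Path`.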
@@ -0,0 +1,66 @@
"""C2 VprStrategy error taxonomy (AZ-336).
Every ``VprStrategy`` method raises only members of :class:`VprError`.
Lower-level exceptions from the backbone runtime (TRT deserialize,
CUDA OOM, ONNX runtime IO mismatch, FAISS index torn mmap) MUST be
caught and rewrapped by the concrete strategy — the contract closes
the error envelope so consumers can ``except VprError`` once and
handle the family.
A separate composition-time error
(:class:`gps_denied_onboard.runtime_root.errors.StrategyNotAvailableError`)
lives outside this family — it is raised by the factory, not by a
``VprStrategy`` method.
Note: C6 ``c6_tile_cache.errors`` also defines an
``IndexUnavailableError`` for the underlying ``DescriptorIndex``
search path. The two classes are intentionally distinct (same name,
different namespaces): the C2 family is the closed envelope a C5/C2.5
consumer sees; the C6 family is the storage-layer error a concrete
strategy is responsible for rewrapping.
"""
from __future__ import annotations
__all__ = [
"IndexUnavailableError",
"VprBackboneError",
"VprError",
"VprPreprocessError",
]
class VprError(Exception):
"""Base class for the C2 VPR error family.
Caught at the runtime root; downstream effect per AC-1.4:
C5 falls back to VIO-only with provenance ``visual_propagated``.
"""
class VprBackboneError(VprError):
"""Backbone forward pass failed.
CUDA OOM, TRT engine deserialize mismatch, ONNX runtime IO
shape mismatch. Logged at ERROR; one FDR record per occurrence.
"""
class VprPreprocessError(VprError):
"""Input frame violates the backbone's preprocessing contract.
Wrong colour channels, calibration mismatch. Logged at ERROR;
one FDR record per occurrence. The concrete preprocessor
(each strategy owns its own per description.md § 6) raises this
and the strategy lets it propagate unchanged.
"""
class IndexUnavailableError(VprError):
"""FAISS index handle invalid for the strategy's retrieve path.
Post-F8 reboot before warm-up, out-of-band file replacement
caught by the underlying mmap defence, dim mismatch caught at
search time. The strategy MUST raise this rather than return
stale candidates (C2-ST-01).
"""
@@ -1,17 +1,113 @@
"""C2 `VprStrategy` Protocol.
"""C2 ``VprStrategy`` Protocol (AZ-336).
Concrete strategies: UltraVPR (primary), MegaLoc, MixVPR, SelaVPR, EigenPlaces,
NetVLAD, SALAD. See `_docs/02_document/components/02_c2_vpr/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; three
methods spanning the camera-ingest hot path
(:meth:`embed_query` + :meth:`retrieve_topk`) and the composition-time
pre-flight check (:meth:`descriptor_dim`).
Concrete impls — :class:`UltraVprStrategy` (AZ-337),
:class:`NetVladStrategy` (AZ-338), :class:`MegaLocStrategy` /
:class:`MixVprStrategy` (AZ-339), :class:`SelaVprStrategy` /
:class:`EigenPlacesStrategy` / :class:`SaladStrategy` (AZ-340) — live
in sibling modules and are imported lazily by
:mod:`gps_denied_onboard.runtime_root.vpr_factory`.
The contract at
``_docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md`` v1.0.0
is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.vpr import VprQuery, VprResult
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.vpr import VprQuery, VprResult
__all__ = ["VprStrategy"]
@runtime_checkable
class VprStrategy(Protocol):
"""Visual Place Recognition strategy: encode → retrieve top-K candidates."""
"""Single-camera visual place recognition strategy.
def retrieve(self, query: VprQuery, top_k: int = 10) -> VprResult: ...
Stateless per-frame; the only persistent state is the loaded
backbone weights and the C6-owned FAISS index handle (passed in
via constructor by the strategy's ``create(...)`` factory).
Invariants (see ``vpr_strategy_protocol.md`` v1.0.0):
- **INV-1 single-threaded** — each instance is bound to one
ingest thread; the composition root enforces. Concurrent
:meth:`embed_query` calls on a single instance race the GPU
stream.
- **INV-2 stateless per-frame** — no implicit dependency on
prior frames; reordering :meth:`embed_query` calls yields
identical embeddings.
- **INV-3 L2-normalised** — :attr:`VprQuery.embedding` is
L2-normalised before return (cosine ≡ Euclidean on the
FAISS HNSW lookup).
- **INV-4 top-K size + order** — :meth:`retrieve_topk` returns
exactly ``k`` candidates, ascending by
:attr:`VprCandidate.descriptor_distance`.
- **INV-5 backbone_label non-empty** — every
:attr:`VprResult.backbone_label` matches the strategy's
``BUILD_VPR_<variant>`` lowercase form.
- **INV-6 deterministic** — same frame + calibration + corpus
→ identical embedding + identical top-K (bit-exact for
float32; ULP-tolerant for float16).
- **INV-7 descriptor_dim stable** — :meth:`descriptor_dim`
never changes after construction; reflects the loaded
weights' output dim, NOT a config knob.
Error envelope: only members of
:class:`gps_denied_onboard.components.c2_vpr.errors.VprError`
escape the three methods. Lower-level exceptions (CUDA, TRT,
FAISS) MUST be rewrapped by the concrete strategy.
"""
def embed_query(
self,
frame: "NavCameraFrame",
calibration: "CameraCalibration",
) -> "VprQuery":
"""Run the backbone forward pass; return a ``VprQuery``.
Calibration is consumed by the strategy's internal
:class:`BackbonePreprocessor` for resize / crop / normalise.
Raises :class:`VprBackboneError` on backbone failure
(CUDA OOM, TRT deserialize mismatch, etc.) and
:class:`VprPreprocessError` on preprocessor contract
violation.
"""
...
def retrieve_topk(self, query: "VprQuery", k: int) -> "VprResult":
"""Run the FAISS HNSW top-K lookup against the corpus index.
The strategy holds the FAISS index handle
(constructor-injected from C6's ``DescriptorIndex``).
Top-K candidates are returned ascending by
:attr:`VprCandidate.descriptor_distance`.
Raises :class:`IndexUnavailableError` when the FAISS index
handle is invalid (post-F8 reboot before warm-up;
out-of-band file replacement caught by mmap defence;
fewer than ``k`` indexed vectors).
"""
...
def descriptor_dim(self) -> int:
"""Backbone embedding dimensionality.
Examples: 512 for UltraVPR; 4096 for NetVLAD-VGG16.
Stable for the strategy's lifetime. Consumed by the
composition root at startup to pre-validate index
compatibility against the C6 ``DescriptorIndex`` sidecar
(mismatch → :class:`ConfigError` at startup, NOT at first
frame).
"""
...
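INV-3 above leans on a standard identity: for unit vectors, squared Euclidean distance equals `2 - 2 * cosine`, so ranking by Euclidean distance and ranking by cosine similarity agree. A small pure-Python sketch, with hypothetical helper names and no dependency on the project's embedding types:

```python
import math


def l2_normalise(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def check_identity(a, b):
    """Return (squared distance, 2 - 2*cosine) for the normalised inputs."""
    ua, ub = l2_normalise(a), l2_normalise(b)
    # For unit vectors: |ua - ub|^2 = |ua|^2 + |ub|^2 - 2 ua.ub = 2 - 2 cos.
    return squared_l2(ua, ub), 2.0 - 2.0 * dot(ua, ub)
```

The two returned values agree up to floating-point rounding, which is why a Euclidean index over normalised embeddings returns candidates in cosine order.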
@@ -1,5 +1,26 @@
"""C3.5 AdHoP Refinement component — Public API."""
"""C3.5 AdHoP / conditional refiner — Public API (AZ-348).
from gps_denied_onboard.components.c3_5_adhop.interface import AdHoPRefinementStrategy
Per ``conditional_refiner_protocol.md`` v1.0.0 the public surface
consists of:
__all__ = ["AdHoPRefinementStrategy"]
- :class:`ConditionalRefiner` Protocol (two methods).
- :class:`C3_5RefinerConfig` config block (registered on import).
The error family (:class:`RefinerError`,
:class:`RefinerBackboneError`, :class:`RefinerConfigError`) and
both concrete strategies (:class:`PassthroughRefiner`,
:class:`AdHoPRefiner`) are intentionally NOT in ``__all__`` per
task spec AC-8: consumers see only the Protocol; concrete
strategies are reached via the runtime-root factory.
"""
from gps_denied_onboard.components.c3_5_adhop.config import C3_5RefinerConfig
from gps_denied_onboard.components.c3_5_adhop.interface import ConditionalRefiner
from gps_denied_onboard.config.schema import register_component_block
register_component_block("c3_5_adhop", C3_5RefinerConfig)
__all__ = [
"C3_5RefinerConfig",
"ConditionalRefiner",
]
@@ -0,0 +1,64 @@
"""C3.5 ``ConditionalRefiner`` config block (AZ-348).
Registered into ``config.components['c3_5_adhop']`` by the
package ``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.refiner_factory.build_refiner_strategy`
reads this block to select the strategy and configure thresholds.
``strategy`` selects one of the two concrete refiners
(``adhop`` — production-default; ``passthrough`` — baseline /
smoke / IT-12 comparison). Both modules are linked
unconditionally: there is NO ``BUILD_REFINER_*`` flag (NOT ADR-002
territory). Runtime selection only.
``residual_threshold_px`` is the conditional-gate threshold: a
:class:`MatchResult` whose ``reprojection_residual_px <=
threshold`` is passed through unchanged; ``>`` invokes the
strategy's refinement procedure. Default 2.5 px (the AC-NEW-5 /
R10 tunable from operator tooling).
``invocation_rate_warn_threshold`` is the rolling-60 s
invocation-rate ceiling above which a WARN log fires
(C3.5-IT-03 / NFT-PERF-01). Must be in ``(0, 1)``; default 0.25.
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"C3_5RefinerConfig",
"KNOWN_STRATEGIES",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset({"adhop", "passthrough"})
@dataclass(frozen=True)
class C3_5RefinerConfig:
"""Per-component config for C3.5 conditional refiner."""
strategy: str = "adhop"
residual_threshold_px: float = 2.5
invocation_rate_warn_threshold: float = 0.25
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
raise ConfigError(
f"C3_5RefinerConfig.strategy={self.strategy!r} not in "
f"{sorted(KNOWN_STRATEGIES)}"
)
if self.residual_threshold_px <= 0.0:
raise ConfigError(
"C3_5RefinerConfig.residual_threshold_px must be > 0; "
f"got {self.residual_threshold_px}"
)
if not (0.0 < self.invocation_rate_warn_threshold < 1.0):
raise ConfigError(
"C3_5RefinerConfig.invocation_rate_warn_threshold must be in "
f"(0, 1); got {self.invocation_rate_warn_threshold}"
)
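The rolling 60 s invocation rate that `invocation_rate_warn_threshold` is compared against could be tracked with a deque of timestamped events. This is a sketch under stated assumptions: `InvocationRateWindow` is a hypothetical class, the rate is taken as invoked frames divided by all frames in the window, and "above" is read as strictly greater than the threshold.

```python
from collections import deque


class InvocationRateWindow:
    """Hypothetical rolling-window tracker for refiner invocations."""

    def __init__(self, window_s: float = 60.0, warn_threshold: float = 0.25):
        self._window_ns = int(window_s * 1_000_000_000)
        self._warn_threshold = warn_threshold
        self._events = deque()  # (timestamp_ns, invoked) pairs, oldest first
        self._invoked = 0

    def record(self, now_ns: int, invoked: bool) -> float:
        """Add one frame and return the rate over the current window."""
        self._events.append((now_ns, invoked))
        self._invoked += int(invoked)
        cutoff = now_ns - self._window_ns
        # Evict frames older than the window; the frame just appended
        # always survives, so the deque is never empty here.
        while self._events[0][0] < cutoff:
            _, old_invoked = self._events.popleft()
            self._invoked -= int(old_invoked)
        return self._invoked / len(self._events)

    def should_warn(self) -> bool:
        if not self._events:
            return False
        return self._invoked / len(self._events) > self._warn_threshold
```

Keeping a running count of invoked frames makes each update amortised constant time instead of rescanning the window on every frame.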
@@ -0,0 +1,44 @@
"""C3.5 ``ConditionalRefiner`` error taxonomy (AZ-348).
The family is intentionally small: per-candidate failures are
handled inside C3 (drop-and-continue); at C3.5 the only failure
mode is the AdHoP backbone (TensorRT exception, OOM, NaN, shape
mismatch) and it is contained within the strategy via
passthrough fall-through (contract Invariant 4) — never re-raised
out of :meth:`refine_if_needed`.
:class:`RefinerConfigError` is the composition-root rejection at
startup; never raised per-frame.
"""
from __future__ import annotations
__all__ = [
"RefinerBackboneError",
"RefinerConfigError",
"RefinerError",
]
class RefinerError(Exception):
"""Base class for all C3.5 refinement-strategy errors."""
class RefinerBackboneError(RefinerError):
"""AdHoP backbone forward-pass failed.
TensorRT exception, OOM, NaN, shape mismatch. Caught inside
:meth:`ConditionalRefiner.refine_if_needed`, converted to
passthrough fall-through (Invariant 4), logged at ERROR with
one FDR record. NEVER re-raised out of the strategy.
"""
class RefinerConfigError(RefinerError):
"""Composition-root rejected the refiner config.
Unknown strategy label OR invalid threshold (``<= 0``). Raised
at startup ONLY — never per-frame. The composition root logs
ERROR (``kind="c3_5.refiner.strategy_unknown"`` or
``kind="c3_5.refiner.invalid_threshold"``) before raising.
"""
@@ -1,16 +1,123 @@
"""C3.5 `AdHoPRefinementStrategy` Protocol.
"""C3.5 ``ConditionalRefiner`` Protocol (AZ-348).
Concrete impl: AdHoP refiner. See `_docs/02_document/components/05_c3_5_adhop/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; a
two-method surface: :meth:`refine_if_needed` and
:meth:`was_invoked`.
Concrete impls — :class:`PassthroughRefiner` (this task) and
:class:`AdHoPRefiner` (AZ-349) — live in sibling modules.
Both are linked into the production binary unconditionally per
ADR-001; runtime selection is via ``config.components['c3_5_adhop'].strategy``
(NO ``BUILD_REFINER_*`` flag — NOT ADR-002 territory because
both strategies are tiny and the AdHoP TRT engine is shared C7
infrastructure).
The contract at
``_docs/02_document/contracts/c3_5_adhop/conditional_refiner_protocol.md``
v1.0.0 is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.matching import MatchResult
if TYPE_CHECKING:
from gps_denied_onboard._types.matcher import MatchResult
from gps_denied_onboard._types.nav import NavCameraFrame
__all__ = ["ConditionalRefiner"]
class AdHoPRefinementStrategy(Protocol):
"""Conditional refinement of a `MatchResult` (geometric verification + outlier purge)."""
@runtime_checkable
class ConditionalRefiner(Protocol):
"""Conditional refinement strategy between C3 (matcher) and C4 (pose).
def refine(self, match: MatchResult) -> MatchResult: ...
Stateless per-frame; the only persistent state is the
constructor-injected backbone runtime handle (when the strategy
uses one) and the last-invocation flag.
Invariants (see ``conditional_refiner_protocol.md`` v1.0.0):
- **INV-1 single-threaded** — each instance is bound to one
ingest thread; same thread as C3 (shared C-frame ingest
path).
- **INV-2 stateless per-frame** — except for the
``was_invoked`` flag, no implicit dependency on prior frames;
reordering :meth:`refine_if_needed` calls (tests only) MUST
yield identical output ``MatchResult`` content.
- **INV-3 conditional gate is a pure comparison** —
``mr.reprojection_residual_px <= threshold`` → passthrough;
``>`` → invoke. No tolerance, no smoothing, no hysteresis.
- **INV-4 passthrough fall-through on backbone error** —
:class:`RefinerBackboneError` raised inside the invoked path
is caught by the strategy and converted to passthrough output
with ``refinement_label = "passthrough"``; the error is
logged at ERROR + emitted to FDR; NEVER re-raised out of
:meth:`refine_if_needed`.
- **INV-5 bit-identical correspondences on passthrough** —
when ``refinement_label == "passthrough"``, every
``inlier_correspondences`` ndarray in the output equals the
input ndarray bit-for-bit (``np.array_equal`` AND same
``dtype``).
- **INV-6 ``refinement_label`` is `"adhop"` OR
`"passthrough"`** — exactly one of those two values; matches
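the strategy's selected variant unless INV-4 fall-through
applies. Readers combine the label with :meth:`was_invoked`:
``"passthrough"`` with ``was_invoked() == True`` means AdHoP
fell through; with ``False`` the gate passed the result through.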
- **INV-7 ``refinement_added_latency_ms`` is strategy-internal
added latency** — always ``>= 0``; near-zero on passthrough;
up to ~90 ms on AdHoP invoke per C3.5-PT-01.
- **INV-8 ``was_invoked()`` semantics** — set to ``True`` iff
the strategy entered the refinement procedure (post-gate,
regardless of whether AdHoP succeeded or fell through). On
pure passthrough strategy + every gate-decided-passthrough
call: ``False``.
- **INV-9 threshold validation** — the strategy MUST reject
``residual_threshold_px <= 0`` (raise :class:`ValueError`);
the composition root validates the config-loaded threshold
at startup so this in-method check is defensive.
"""
def refine_if_needed(
self,
frame: "NavCameraFrame",
mr: "MatchResult",
residual_threshold_px: float,
) -> "MatchResult":
"""Either pass ``mr`` through unchanged or run refinement.
If ``mr.reprojection_residual_px <= residual_threshold_px``
(the steady-state path), return ``mr`` unchanged AND set
:meth:`was_invoked` to ``False``. Otherwise run the
strategy's refinement procedure and return an enriched
:class:`MatchResult` (typically via
:func:`dataclasses.replace`) with ``refinement_label``
set, AND set :meth:`was_invoked` to ``True``.
On :class:`RefinerBackboneError` (AdHoP backbone failure
during the invoked path), the refiner MUST fall through to
passthrough — return ``mr`` unchanged with
``refinement_label = "passthrough"`` AND :meth:`was_invoked`
``True`` (the attempt counts towards the invocation rate
even on failure). The error is logged at ERROR + emitted
to FDR; downstream pose estimation gets a usable
:class:`MatchResult` and decides whether to trigger F6.
Determinism: same inputs MUST produce the same output. No
probabilistic gating; no time-based gating.
Raises:
ValueError: ``residual_threshold_px <= 0`` (defensive;
composition root should have caught this already).
"""
...
def was_invoked(self) -> bool:
"""Return ``True`` iff the last :meth:`refine_if_needed`
call entered the refinement procedure.
Set at the start of every :meth:`refine_if_needed` call.
Used by FDR per-frame provenance and by
NFT-PERF-01 / C3.5-IT-03 invocation-rate accounting.
"""
...
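A minimal sketch of the gate, fall-through, and ``was_invoked`` semantics described by INV-3, INV-4, INV-8 and INV-9. Everything here is a stand-in for illustration: ``MatchResult`` is reduced to two fields, ``SketchRefiner`` and its ``backbone`` callable are hypothetical, and the logging / FDR emission required by INV-4 is only noted in a comment.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MatchResult:  # hypothetical two-field stand-in for the real DTO
    reprojection_residual_px: float
    refinement_label: str = "passthrough"


class RefinerBackboneError(Exception):
    """Stand-in for the error named in INV-4."""


class SketchRefiner:
    """Illustrative only; `backbone` is a hypothetical callable that
    returns a refined residual or raises RefinerBackboneError."""

    def __init__(self, backbone):
        self._backbone = backbone
        self._was_invoked = False

    def refine_if_needed(self, frame, mr, residual_threshold_px):
        if residual_threshold_px <= 0:  # INV-9 defensive check
            raise ValueError(
                f"residual_threshold_px must be > 0; got {residual_threshold_px}"
            )
        # INV-3: pure comparison, no tolerance, smoothing, or hysteresis.
        if mr.reprojection_residual_px <= residual_threshold_px:
            self._was_invoked = False
            return mr
        # INV-8: the flag flips as soon as the gate decides to invoke,
        # whether or not the backbone later succeeds.
        self._was_invoked = True
        try:
            refined_residual = self._backbone(frame, mr)
        except RefinerBackboneError:
            # INV-4: fall through. Real code would also log at ERROR
            # and emit an FDR record here; the error is not re-raised.
            return replace(mr, refinement_label="passthrough")
        return replace(
            mr,
            reprojection_residual_px=refined_residual,
            refinement_label="adhop",
        )

    def was_invoked(self):
        return self._was_invoked
```

With this shape, a caller can separate the three outcomes by reading ``refinement_label`` together with ``was_invoked()``: gate passthrough (``"passthrough"``, ``False``), successful refinement (``"adhop"``, ``True``), and fall-through (``"passthrough"``, ``True``).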
@@ -0,0 +1,84 @@
"""``PassthroughRefiner`` — no-op :class:`ConditionalRefiner` (AZ-348).
The reference baseline implementation. Returns the input
:class:`MatchResult` unchanged (same object reference;
``inlier_correspondences`` ndarrays are bit-identical and share
references per contract INV-5). Always sets :meth:`was_invoked`
to ``False`` per INV-8.
Both helpers (``ransac_filter``, ``inference_runtime``) are held
by reference for parity with :class:`AdHoPRefiner` (AZ-349) but
neither is invoked. ``inference_runtime`` is typed as ``object``
because the C3 matcher → C3.5 refiner → C4 pose layering forbids
L3-to-L3 imports (architecture test ``test_az270_compose_root``);
the composition-root factory at
:mod:`gps_denied_onboard.runtime_root.refiner_factory`
narrows the type at construction time.
"""
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from gps_denied_onboard._types.matcher import MatchResult
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["PassthroughRefiner", "create"]
class PassthroughRefiner:
"""Reference passthrough strategy.
See module docstring. Stateless except for the ``_was_invoked``
flag (always ``False`` per INV-8); concurrent calls are unsafe
(the single-thread invariant covers this).
"""
def __init__(
self,
*,
ransac_filter: "RansacFilter",
inference_runtime: object,
) -> None:
# Held for parity with AdHoPRefiner; neither is invoked.
self._ransac_filter = ransac_filter
self._inference_runtime = inference_runtime
self._was_invoked: bool = False
def refine_if_needed(
self,
frame: "NavCameraFrame",
mr: "MatchResult",
residual_threshold_px: float,
) -> "MatchResult":
if residual_threshold_px <= 0.0:
raise ValueError(
"residual_threshold_px must be > 0; "
f"got {residual_threshold_px}"
)
self._was_invoked = False
# `MatchResult` defaults `refinement_label="passthrough"`
# and `refinement_added_latency_ms=0.0` already; the input
# is bit-identical per contract INV-5 so we return it by
# reference rather than recreating via `dataclasses.replace`.
return mr
def was_invoked(self) -> bool:
return self._was_invoked
def create(
config: "Config",
*,
ransac_filter: "RansacFilter",
inference_runtime: object,
) -> PassthroughRefiner:
"""Module-level factory entry point consumed by the
composition-root :func:`build_refiner_strategy`."""
return PassthroughRefiner(
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
)
@@ -1,6 +1,51 @@
"""C3 Cross-Domain Matcher component — Public API."""
"""C3 cross-domain matcher — Public API (AZ-344).
from gps_denied_onboard._types.matching import MatchResult
Per ``cross_domain_matcher_protocol.md`` v1.0.0 the public surface
consists of:
- :class:`CrossDomainMatcher` Protocol (two methods).
- DTOs re-exported from :mod:`gps_denied_onboard._types.matcher`
(the L1 home for cross-component DTOs):
:class:`MatchResult`, :class:`MatcherHealth`. The internal
per-candidate sub-DTO :class:`CandidateMatchSet` is intentionally
re-exported too because C3.5 reads
``MatchResult.per_candidate[i]`` directly.
- Error family rooted at :class:`MatcherError`; two documented
subtypes (:class:`MatcherBackboneError`,
:class:`InsufficientInliersError`).
- Config block :class:`C3MatcherConfig` (registered on import).
Internals — :class:`RollingHealthWindow` and the concrete strategy
modules (``disk_lightglue``, ``aliked_lightglue``, ``xfeat``) —
are intentionally NOT re-exported: consumers see only the
Protocol; concrete strategies are imported lazily by
:mod:`gps_denied_onboard.runtime_root.matcher_factory` (Risk-2
mitigation).
"""
from gps_denied_onboard._types.matcher import (
CandidateMatchSet,
MatchResult,
MatcherHealth,
)
from gps_denied_onboard.components.c3_matcher.config import C3MatcherConfig
from gps_denied_onboard.components.c3_matcher.errors import (
InsufficientInliersError,
MatcherBackboneError,
MatcherError,
)
from gps_denied_onboard.components.c3_matcher.interface import CrossDomainMatcher
from gps_denied_onboard.config.schema import register_component_block
__all__ = ["CrossDomainMatcher", "MatchResult"]
register_component_block("c3_matcher", C3MatcherConfig)
__all__ = [
"C3MatcherConfig",
"CandidateMatchSet",
"CrossDomainMatcher",
"InsufficientInliersError",
"MatchResult",
"MatcherBackboneError",
"MatcherError",
"MatcherHealth",
]
@@ -0,0 +1,122 @@
"""C3 rolling matcher-health accumulator (AZ-344).
:class:`RollingHealthWindow` maintains the three accumulators that
back :class:`MatcherHealth` snapshots: ``consecutive_low_inlier``,
``mean_inliers_60s``, ``backbone_error_count_60s``. The 60 s window
is configurable for tests via the constructor.
The structure is intentionally **single-thread** — no locks. The
composition root binds every :class:`CrossDomainMatcher` to one
ingest thread (AC-9) so adding a lock here would only mask binding
bugs.
Data structure choice: a ``collections.deque`` of
``(timestamp_ns, inlier_count, had_backbone_error)`` tuples plus
two running sums (``_inlier_sum``, ``_error_sum``). Every
:meth:`update` call evicts expired entries from the left while
maintaining the sums — amortised O(1). :meth:`snapshot` is strict
O(1): it reads the sums and the current ``len(self._window)``
without touching the deque body, so the NFR-perf-window microbench
(p99 ≤ 50 µs) holds even with 6000 entries (100 Hz × 60 s).
"""
from __future__ import annotations
from collections import deque
from typing import Final
from gps_denied_onboard._types.matcher import MatcherHealth
__all__ = ["RollingHealthWindow"]
_DEFAULT_WINDOW_NS: Final[int] = 60 * 1_000_000_000
class RollingHealthWindow:
"""Sliding 60 s window over best-candidate inlier counts.
Constructor-injected into every concrete :class:`CrossDomainMatcher`
so all strategies share semantics (the alternative — every
matcher reimplements the window — drifts between backbones and
breaks C5's spoof-promotion gate consistency).
"""
def __init__(
self,
*,
min_inliers_threshold: int,
window_ns: int = _DEFAULT_WINDOW_NS,
) -> None:
if min_inliers_threshold < 1:
raise ValueError(
"min_inliers_threshold must be >= 1; "
f"got {min_inliers_threshold}"
)
if window_ns < 1:
raise ValueError(f"window_ns must be >= 1; got {window_ns}")
self._min_inliers_threshold: int = min_inliers_threshold
self._window_ns: int = window_ns
# Each entry: (timestamp_ns, inlier_count, had_backbone_error)
self._window: deque[tuple[int, int, bool]] = deque()
self._inlier_sum: int = 0
self._error_sum: int = 0
self._consecutive_low_inlier: int = 0
@property
def window_ns(self) -> int:
return self._window_ns
def update(
self,
*,
timestamp_ns: int,
best_inlier_count: int,
had_backbone_error: bool,
) -> None:
"""Record one frame's outcome and evict any expired entries.
``best_inlier_count`` is the BEST candidate's inlier count
for the frame (zero if every candidate was dropped /
below-threshold). ``had_backbone_error`` is ``True`` if at
least one per-candidate
:class:`MatcherBackboneError` fired in this frame.
AC-12: ``consecutive_low_inlier`` increments when
``best_inlier_count < min_inliers_threshold``; resets to
zero on any frame whose count meets or exceeds the floor.
"""
if best_inlier_count < 0:
raise ValueError(
f"best_inlier_count must be >= 0; got {best_inlier_count}"
)
cutoff = timestamp_ns - self._window_ns
window = self._window
while window and window[0][0] <= cutoff:
_ts, expired_inliers, expired_error = window.popleft()
self._inlier_sum -= expired_inliers
if expired_error:
self._error_sum -= 1
window.append((timestamp_ns, best_inlier_count, had_backbone_error))
self._inlier_sum += best_inlier_count
if had_backbone_error:
self._error_sum += 1
if best_inlier_count < self._min_inliers_threshold:
self._consecutive_low_inlier += 1
else:
self._consecutive_low_inlier = 0
def snapshot(self) -> MatcherHealth:
"""Return the current :class:`MatcherHealth` snapshot.
O(1): reads the running sums + ``len(self._window)`` only.
Empty window → ``mean_inliers_60s == 0.0`` (consumers
treat zero as "insufficient data" rather than "zero matches").
"""
count = len(self._window)
mean = (self._inlier_sum / count) if count else 0.0
return MatcherHealth(
consecutive_low_inlier=self._consecutive_low_inlier,
mean_inliers_60s=mean,
backbone_error_count_60s=self._error_sum,
)
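The eviction boundary and the consecutive-low-inlier reset are easiest to see with a tiny window. The sketch below is a condensed stand-in for the accumulator above, not the real class: ``MiniWindow`` and its positional signature are hypothetical, and a 10 ns window is used purely so the arithmetic is easy to follow.

```python
from collections import deque


class MiniWindow:
    """Condensed stand-in for the rolling accumulator (illustration only)."""

    def __init__(self, min_inliers_threshold, window_ns):
        self._floor = min_inliers_threshold
        self._window_ns = window_ns
        self._entries = deque()  # (timestamp_ns, inliers, had_error)
        self._inlier_sum = 0
        self._error_sum = 0
        self.consecutive_low = 0

    def update(self, timestamp_ns, inliers, had_error):
        # Entries at or before the cutoff are expired (note the <=).
        cutoff = timestamp_ns - self._window_ns
        while self._entries and self._entries[0][0] <= cutoff:
            _ts, old_inliers, old_error = self._entries.popleft()
            self._inlier_sum -= old_inliers
            self._error_sum -= int(old_error)
        self._entries.append((timestamp_ns, inliers, had_error))
        self._inlier_sum += inliers
        self._error_sum += int(had_error)
        # The streak counter is independent of the window contents.
        if inliers < self._floor:
            self.consecutive_low += 1
        else:
            self.consecutive_low = 0

    def mean(self):
        n = len(self._entries)
        return self._inlier_sum / n if n else 0.0

    def error_count(self):
        return self._error_sum


w = MiniWindow(min_inliers_threshold=60, window_ns=10)
w.update(0, 100, False)
w.update(5, 40, True)
w.update(10, 20, False)  # cutoff is 0, so the t=0 entry is evicted
```

After the third update the window holds the entries at t=5 and t=10, so the mean is 30.0, the error count is 1, and the low-inlier streak is 2.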
@@ -0,0 +1,67 @@
"""C3 ``CrossDomainMatcher`` config block (AZ-344).
Registered into ``config.components['c3_matcher']`` by the package
``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.matcher_factory.build_matcher_strategy`
reads this block to select the strategy and configure thresholds.
``strategy`` selects one of the three concrete backbones
(``disk_lightglue``, ``aliked_lightglue``, ``xfeat``); the
composition-root factory respects compile-time
``BUILD_MATCHER_<variant>`` gating on top of this label.
``min_inliers_threshold`` is the per-candidate floor: candidates
whose RANSAC inlier count falls below this value are treated as
failed (drop-and-continue) and counted into
``MatcherHealth.consecutive_low_inlier``. Default 60 — leaves
headroom below the AC-1.1 floor (p5 ≥ 80) so calibration drift
does not immediately trip the spoof gate; FT-P-19 telemetry will
tune it.
``residual_warn_threshold_px`` is the median reprojection-residual
limit (pixels) above which the matcher emits a WARN log; default
2.5 px (the AC-1.2 floor).
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"C3MatcherConfig",
"KNOWN_STRATEGIES",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset(
{"disk_lightglue", "aliked_lightglue", "xfeat"}
)
@dataclass(frozen=True)
class C3MatcherConfig:
"""Per-component config for C3 cross-domain matcher."""
strategy: str = "disk_lightglue"
min_inliers_threshold: int = 60
residual_warn_threshold_px: float = 2.5
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
raise ConfigError(
f"C3MatcherConfig.strategy={self.strategy!r} not in "
f"{sorted(KNOWN_STRATEGIES)}"
)
if self.min_inliers_threshold < 1:
raise ConfigError(
"C3MatcherConfig.min_inliers_threshold must be >= 1; "
f"got {self.min_inliers_threshold}"
)
if self.residual_warn_threshold_px <= 0.0:
raise ConfigError(
"C3MatcherConfig.residual_warn_threshold_px must be > 0; "
f"got {self.residual_warn_threshold_px}"
)
@@ -0,0 +1,55 @@
"""C3 ``CrossDomainMatcher`` error taxonomy (AZ-344).
The family is intentionally narrow: a per-candidate failure is the
normal case (drop-and-continue, INV-4) and is signalled via
``MatchResult.candidates_dropped`` — NOT via an exception. An
exception escapes :meth:`CrossDomainMatcher.match` only when EVERY
candidate fails OR every candidate's inlier count falls below
``config.matcher.min_inliers_threshold``; both surface as
:class:`InsufficientInliersError` which is the C5 → VIO-only
fallback trigger per AC-3.5.
:class:`MatcherBackboneError` is raised INSIDE the per-candidate
loop, caught by the strategy, logged ERROR, FDR-stamped, and the
candidate is dropped. It is exposed publicly so the per-candidate
log + FDR taxonomy is observable and so future matchers using a
different backbone can re-raise the same kind.
"""
from __future__ import annotations
__all__ = [
"InsufficientInliersError",
"MatcherBackboneError",
"MatcherError",
]
class MatcherError(Exception):
"""Base class for the C3 matcher error family.
Caught at the runtime root only when
:class:`InsufficientInliersError` fires; per-candidate failures
stay inside the strategy.
"""
class MatcherBackboneError(MatcherError):
"""Per-candidate backbone forward-pass failure.
Examples: CUDA OOM, TRT engine deserialization mismatch. Logged at ERROR; one
FDR record per occurrence; the offending candidate is dropped
from the match set; the surrounding :meth:`match` call continues
with the remaining candidates (INV-4).
"""
class InsufficientInliersError(MatcherError):
"""Zero survivors after the per-candidate loop, OR every
candidate's inlier count is below
``config.matcher.min_inliers_threshold``.
Logged at ERROR; FDR record ``kind=matcher.insufficient_inliers``
or ``kind=matcher.all_failed`` (per the trigger). C5 falls back
to VIO-only with provenance ``visual_propagated`` (AC-3.5).
"""
@@ -1,19 +1,120 @@
"""C3 `CrossDomainMatcher` Protocol.
"""C3 ``CrossDomainMatcher`` Protocol (AZ-344).
Concrete impls: DISK+LightGlue (primary), ALIKED+LightGlue, XFeat. See
`_docs/02_document/components/04_c3_matcher/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; a
two-method surface: :meth:`match` and :meth:`health_snapshot`.
Concrete impls — DISK+LightGlue (AZ-345), ALIKED+LightGlue
(AZ-346), XFeat (AZ-347) — live in sibling modules and are imported
lazily by
:mod:`gps_denied_onboard.runtime_root.matcher_factory`.
The contract at
``_docs/02_document/contracts/c3_matcher/cross_domain_matcher_protocol.md``
v1.0.0 is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.matching import MatchResult
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.tile import Tile
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matcher import MatchResult, MatcherHealth
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
__all__ = ["CrossDomainMatcher"]
@runtime_checkable
class CrossDomainMatcher(Protocol):
"""Match a nav-camera frame against a satellite tile."""
"""Cross-domain (nav-camera ↔ satellite-imagery) matcher strategy.
def match(self, frame: NavCameraFrame, tile: Tile) -> MatchResult: ...
Stateless per-frame; the only persistent state is the
constructor-injected backbone runtime handles
(``InferenceRuntime`` for feature extraction,
:class:`gps_denied_onboard.helpers.lightglue_runtime.LightGlueRuntime`
for matching,
:class:`gps_denied_onboard.helpers.ransac_filter.RansacFilter`
for inlier filtering) and the rolling
:class:`gps_denied_onboard.components.c3_matcher._health_window.RollingHealthWindow`.
Invariants (see ``cross_domain_matcher_protocol.md`` v1.0.0):
- **INV-1 single-threaded** — each instance is bound to one
ingest thread by the composition root (same thread as C2.5
because they share the ``LightGlueRuntime``).
- **INV-2 stateless per-frame for `match`** — except for the
rolling health window, no implicit dependency on prior
frames; reordering ``match`` calls (tests only) MUST yield
identical ``MatchResult`` content.
- **INV-3 best-candidate selection is deterministic** —
``MatchResult.best_candidate_idx == 0`` and
``per_candidate`` is sorted by ``inlier_count`` descending
with ties broken by ``per_candidate_residual_px`` ascending
(lower residual wins).
- **INV-4 drop-and-continue per candidate** — per-candidate
exceptions never propagate out of ``match`` unless every
candidate fails. Mirrors C2.5 INV-8.
- **INV-5 ``per_candidate`` length is bounded** —
``0 < len <= len(rerank_result.candidates)``; zero raises
:class:`InsufficientInliersError`; never exceeds the input N.
- **INV-6 ``matcher_label`` is non-empty** — every
``MatchResult`` carries the strategy's name
(e.g., ``"disk_lightglue"``) for FDR provenance; MUST match
``BUILD_MATCHER_<variant>`` lowercase.
- **INV-7 ``inlier_correspondences`` shape contract** —
``ndarray[I, 4, dtype=float32]``, columns
``(px_query, py_query, px_tile, py_tile)``; rows are RANSAC
inliers only; ``I == inlier_count``.
- **INV-8 ``reprojection_residual_px`` is the BEST candidate's
median residual** — not the mean, not the max; downstream
C3.5's threshold gate compares against this value.
- **INV-9 ``health_snapshot()`` is cheap** — O(1); reads the
rolling window's pre-computed accumulators. Never recomputes
over the window contents.
Error envelope: only :class:`InsufficientInliersError` escapes
:meth:`match`; per-candidate
:class:`MatcherBackboneError` and C6 ``TileFetchError``
instances are caught inside the loop and turned into dropped
candidates + ERROR logs + per-occurrence FDR records.
"""
def match(
self,
frame: "NavCameraFrame",
rerank_result: "RerankResult",
calibration: "CameraCalibration",
) -> "MatchResult":
"""Match a frame against every candidate in ``rerank_result``.
For each candidate:
1. Extract features from the nav frame and from the tile
pixels (via the constructor-injected
:class:`InferenceRuntime` handle).
2. Run LightGlue forward via the shared
:class:`LightGlueRuntime`.
3. RANSAC-filter correspondences via the shared
:class:`RansacFilter`; record inliers + median residual.
Sort survivors by ``(inlier_count desc, residual asc)`` and
return as :class:`MatchResult`. Drop-and-continue
semantics apply per INV-4.
Raises:
InsufficientInliersError: zero survivors after the
per-candidate loop, OR every candidate's
``inlier_count`` is below
``config.matcher.min_inliers_threshold``.
"""
...
def health_snapshot(self) -> "MatcherHealth":
"""Return a rolling-window matcher health snapshot.
O(1) per INV-9. Drives C5's spoof-promotion gate
(AC-NEW-2 / AC-NEW-7) and post-flight forensics.
"""
...
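The drop-and-continue loop (INV-4), the all-failed escape hatch, and the deterministic ordering (INV-3) can be sketched together as follows. This is an assumption-laden illustration, not a concrete strategy: ``Survivor``, ``match_candidates`` and the ``match_one`` callable are hypothetical, the exception classes are local stand-ins, and the ERROR logging and FDR records the contract requires are reduced to comments.

```python
from __future__ import annotations

from dataclasses import dataclass


class MatcherError(Exception):
    """Local stand-in for the C3 error family root."""


class MatcherBackboneError(MatcherError):
    """Per-candidate failure; never escapes the loop."""


class InsufficientInliersError(MatcherError):
    """Raised only when no candidate survives."""


@dataclass(frozen=True)
class Survivor:  # hypothetical stand-in for a per-candidate result
    candidate_id: str
    inlier_count: int
    residual_px: float


def match_candidates(candidate_ids, match_one, min_inliers_threshold):
    """`match_one(candidate_id)` is a hypothetical callable returning
    (inlier_count, median_residual_px) or raising MatcherBackboneError."""
    survivors = []
    dropped = 0
    for cid in candidate_ids:
        try:
            inliers, residual = match_one(cid)
        except MatcherBackboneError:
            dropped += 1  # real code: ERROR log + one FDR record
            continue
        if inliers < min_inliers_threshold:
            dropped += 1  # below the floor counts as a failed candidate
            continue
        survivors.append(Survivor(cid, inliers, residual))
    if not survivors:
        raise InsufficientInliersError(f"all {dropped} candidates dropped")
    # INV-3: inlier count descending, ties broken by residual ascending,
    # so index 0 is always the best candidate.
    survivors.sort(key=lambda s: (-s.inlier_count, s.residual_px))
    return survivors, dropped
```

Sorting on the tuple ``(-inlier_count, residual_px)`` keeps the tie-break in a single key, which is one straightforward way to make the ordering reproducible across runs.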
@@ -26,7 +26,7 @@ from typing import TYPE_CHECKING, Protocol, runtime_checkable
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matching import MatchResult
from gps_denied_onboard._types.matcher import MatchResult
from gps_denied_onboard._types.pose import CovarianceMode, PoseEstimate
from gps_denied_onboard._types.thermal import ThermalState
@@ -37,7 +37,6 @@ transitions.
from __future__ import annotations
import threading
import time
from collections.abc import Callable
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Final, Protocol, runtime_checkable
@@ -107,8 +106,8 @@ class FallbackWatcher:
*,
threshold_s: float,
fdr_client: FdrClient | None,
clock_ns: Callable[[], int],
producer_id: str = "c5_state",
clock_ns: Callable[[], int] = time.monotonic_ns,
) -> None:
if threshold_s <= 0.0:
raise ValueError(f"FallbackWatcher.threshold_s must be > 0; got {threshold_s}")
@@ -18,7 +18,6 @@ defensive trace.
from __future__ import annotations
import time
from typing import TYPE_CHECKING, Any, Protocol, runtime_checkable
import gtsam
@@ -205,5 +204,10 @@ class ISam2GraphHandleImpl(ISam2GraphHandle):
anchor (``_last_anchor_ns`` is initialised to 0 in the
estimator constructor). This matches the C5 contract's
documented "no anchor yet" sentinel.
Reads the estimator's injected :class:`Clock` so replay /
unit-test runs see deterministic age values.
"""
return (time.monotonic_ns() - self._estimator._last_anchor_ns) // 1_000_000
return (
self._estimator._clock.monotonic_ns() - self._estimator._last_anchor_ns
) // 1_000_000
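Routing the age computation through an injected clock is what makes it testable without sleeping. A minimal sketch, assuming only that the clock exposes ``monotonic_ns()`` as the diff shows; ``FakeClock``, ``advance_ms`` and the free function ``last_anchor_age_ms`` are hypothetical names used for illustration.

```python
from typing import Protocol


class Clock(Protocol):
    """Only the one method the diff relies on."""

    def monotonic_ns(self) -> int: ...


class FakeClock:
    """Hypothetical test double: time moves only when told to."""

    def __init__(self, start_ns: int = 0) -> None:
        self._now_ns = start_ns

    def monotonic_ns(self) -> int:
        return self._now_ns

    def advance_ms(self, ms: int) -> None:
        self._now_ns += ms * 1_000_000


def last_anchor_age_ms(clock: Clock, last_anchor_ns: int) -> int:
    # Same arithmetic as the hunk above: nanosecond delta, floored to ms.
    return (clock.monotonic_ns() - last_anchor_ns) // 1_000_000


clock = FakeClock(start_ns=1_000_000_000)
anchor_ns = clock.monotonic_ns()
clock.advance_ms(250)
age_ms = last_anchor_age_ms(clock, anchor_ns)
```

Here ``age_ms`` is exactly 250 on every run, whereas a direct ``time.monotonic_ns()`` call would make the value depend on wall time elapsed in the test.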
@@ -35,7 +35,6 @@ matrix simpler.
from __future__ import annotations
import threading
import time
from collections.abc import Callable
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Final, Protocol, runtime_checkable
@@ -154,8 +153,8 @@ class SourceLabelStateMachine:
spoof_promotion_visual_consistency_tol_m: float,
spoof_promotion_bounded_delta_m: float,
fdr_client: FdrClient | None,
clock_ns: Callable[[], int],
producer_id: str = "c5_state",
clock_ns: Callable[[], int] = time.monotonic_ns,
) -> None:
if spoof_promotion_min_stable_s <= 0.0:
raise ValueError(
@@ -47,7 +47,6 @@ filter; this module documents the deviation in the
from __future__ import annotations
import math
import time
from collections import deque
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Any, Final, Literal
@@ -57,6 +56,7 @@ import numpy as np
from numpy.linalg import LinAlgError
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard._types.state import (
EstimatorHealth,
EstimatorOutput,
@@ -89,9 +89,9 @@ from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
from gps_denied_onboard._types.fc import GpsHealth, GpsSample
from gps_denied_onboard._types.nav import ImuWindow
from gps_denied_onboard._types.nav import ImuWindow, VioOutput
from gps_denied_onboard._types.pose import PoseEstimate
from gps_denied_onboard._types.vio import VioOutput
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.config import Config
__all__ = [
@@ -162,6 +162,7 @@ class EskfStateEstimator(StateEstimator):
se3_utils: Any,
wgs_converter: Any,
fdr_client: Any,
clock: Clock | None = None,
) -> None:
block = self._extract_block(config)
self._config: Config = config
@@ -170,6 +171,7 @@ class EskfStateEstimator(StateEstimator):
self._se3_utils: Any = se3_utils
self._wgs_converter: Any = wgs_converter
self._fdr_client: Any = fdr_client
self._clock: Clock = clock if clock is not None else WallClock()
self._log = get_logger("c5_state.eskf_baseline")
self._nominal_pos: np.ndarray = np.zeros(3, dtype=np.float64)
@@ -215,6 +217,7 @@ class EskfStateEstimator(StateEstimator):
spoof_promotion_visual_consistency_tol_m=block.spoof_promotion_visual_consistency_tol_m,
spoof_promotion_bounded_delta_m=block.spoof_promotion_bounded_delta_m,
fdr_client=fdr_client,
clock_ns=self._clock.monotonic_ns,
producer_id="c5_state",
)
@@ -222,6 +225,7 @@ class EskfStateEstimator(StateEstimator):
self._fallback = FallbackWatcher(
threshold_s=block.no_estimate_fallback_s,
fdr_client=fdr_client,
clock_ns=self._clock.monotonic_ns,
producer_id="c5_state",
)
@@ -449,9 +453,9 @@ class EskfStateEstimator(StateEstimator):
residual in the previous body frame.
"""
self._close_cold_start_window()
ts_ns = _datetime_to_ns(vio.timestamp)
ts_ns = vio.emitted_at_ns
self._guard_timestamp(ts_ns, source="vio")
curr_pose = _pose_se3_to_array(vio.pose_se3)
curr_pose = vio.relative_pose_T.matrix()
if self._prev_vio_pose is None:
self._prev_vio_pose = curr_pose
@@ -498,7 +502,7 @@ class EskfStateEstimator(StateEstimator):
H = np.zeros((6, _N_STATE), dtype=np.float64)
H[0:3, _IDX_POS] = np.eye(3)
H[3:6, _IDX_ROT] = prev_R # rotate body-frame perturbation back to world
R_meas = _measurement_noise(vio.covariance_6x6)
R_meas = _measurement_noise(vio.pose_covariance_6x6)
try:
self._kalman_update(H, residual, R_meas, source="vio")
@@ -538,7 +542,7 @@ class EskfStateEstimator(StateEstimator):
# Both modes are treated identically by the ESKF — the
# JACOBIAN exclusion is iSAM2-graph-specific. AC-4.
self._last_anchor_ns = time.monotonic_ns()
self._last_anchor_ns = self._clock.monotonic_ns()
residual_pos = meas_pose[:3, 3] - self._nominal_pos
meas_R = meas_pose[:3, :3]
@@ -612,7 +616,7 @@ class EskfStateEstimator(StateEstimator):
def current_estimate(self) -> EstimatorOutput:
"""Forward-time estimate. ``smoothed=False`` (Invariant 7)."""
now_ns = time.monotonic_ns()
now_ns = self._clock.monotonic_ns()
self._fallback.check_and_engage(now_ns)
cov6 = self._pose_covariance_6x6()
@@ -629,7 +633,7 @@ class EskfStateEstimator(StateEstimator):
)
raise
emitted_at = time.monotonic_ns()
emitted_at = self._clock.monotonic_ns()
position_wgs84 = self._enu_pose_to_wgs84()
orientation = _quat_to_quat_dto(self._nominal_q)
velocity_world = (
@@ -864,7 +868,7 @@ class EskfStateEstimator(StateEstimator):
return
try:
machine.notify_satellite_anchor(
now_ns=time.monotonic_ns(),
now_ns=self._clock.monotonic_ns(),
gps_consistency_delta_m=None,
)
except Exception as exc:
@@ -6,7 +6,7 @@ real bodies. AZ-383 owns the three Protocol factor-add methods:
* ``add_vio(vio: VioOutput)`` ``BetweenFactorPose3`` between
consecutive pose keys with a noise model derived from
``vio.covariance_6x6``.
``vio.pose_covariance_6x6``.
* ``add_pose_anchor(pose: PoseEstimate)`` mode-dispatched per
``pose.covariance_mode``: ``"marginals"`` ``PriorFactorPose3`` +
``update``; ``"jacobian"`` skip iSAM2 add (per the AZ-361 cross-task
@@ -31,7 +31,6 @@ there.
from __future__ import annotations
import math
import time
from collections import deque
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Any, Final, Literal
@@ -43,6 +42,7 @@ import numpy as np
from numpy.linalg import LinAlgError
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard._types.state import (
EstimatorHealth,
EstimatorOutput,
@@ -79,9 +79,9 @@ from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
from gps_denied_onboard._types.fc import GpsHealth, GpsSample
from gps_denied_onboard._types.nav import ImuWindow
from gps_denied_onboard._types.nav import ImuWindow, VioOutput
from gps_denied_onboard._types.pose import PoseEstimate
from gps_denied_onboard._types.vio import VioOutput
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.config import Config
__all__ = [
@@ -148,6 +148,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
se3_utils: Any,
wgs_converter: Any,
fdr_client: Any,
clock: Clock | None = None,
) -> None:
block = self._extract_block(config)
@@ -157,6 +158,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
self._se3_utils: Any = se3_utils
self._wgs_converter: Any = wgs_converter
self._fdr_client: Any = fdr_client
self._clock: Clock = clock if clock is not None else WallClock()
self._isam2 = gtsam.ISAM2(gtsam.ISAM2Params())
window_seconds: float = block.keyframe_window_size * _FRAME_PERIOD_S
@@ -224,6 +226,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
spoof_promotion_visual_consistency_tol_m=block.spoof_promotion_visual_consistency_tol_m,
spoof_promotion_bounded_delta_m=block.spoof_promotion_bounded_delta_m,
fdr_client=fdr_client,
clock_ns=self._clock.monotonic_ns,
producer_id="c5_state",
)
# AC-NEW-8 rolling window of ``(ts_monotonic_ns, cov_norm)``
@@ -255,6 +258,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
self._fallback = FallbackWatcher(
threshold_s=block.no_estimate_fallback_s,
fdr_client=fdr_client,
clock_ns=self._clock.monotonic_ns,
producer_id="c5_state",
)
@@ -403,7 +407,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
"""Register a callback fired exactly once per fallback recovery."""
return self._fallback.subscribe_recovered(callback)
def key_for_frame(self, frame_id: UUID | int) -> int:
def key_for_frame(self, frame_id: UUID | int | str) -> int:
"""Return the GTSAM ``Key`` for ``frame_id``, assigning on first use.
AZ-383 calls this from ``add_vio`` and ``add_pose_anchor`` to
@@ -481,7 +485,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
# AC-6 / Invariant 11a: do NOT advance ``_last_added_ts_ns`` —
# this is a pre-takeoff seed, not a measurement; the first
# subsequent ``add_*`` call still sees the unguarded baseline.
ts_ns = time.monotonic_ns()
ts_ns = self._clock.monotonic_ns()
try:
handle.add_factor(factor)
self._values.insert(prior_key, prior_pose)
@@ -653,10 +657,10 @@ class GtsamIsam2StateEstimator(StateEstimator):
"""
handle = self._require_handle()
self._close_cold_start_window()
ts_ns = _datetime_to_ns(vio.timestamp)
ts_ns = vio.emitted_at_ns
self._guard_timestamp(ts_ns, source="vio")
curr_pose = _pose_se3_to_gtsam(vio.pose_se3)
curr_pose = vio.relative_pose_T
curr_key = self.key_for_frame(vio.frame_id)
if self._prev_vio is None:
@@ -674,10 +678,10 @@ class GtsamIsam2StateEstimator(StateEstimator):
)
return
prev_pose = _pose_se3_to_gtsam(self._prev_vio.pose_se3)
prev_pose = self._prev_vio.relative_pose_T
prev_key = self.key_for_frame(self._prev_vio.frame_id)
relative_pose = prev_pose.between(curr_pose)
noise = _build_pose_noise(vio.covariance_6x6)
noise = _build_pose_noise(vio.pose_covariance_6x6)
factor = gtsam.BetweenFactorPose3(prev_key, curr_key, relative_pose, noise)
try:
@@ -734,7 +738,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
# Both paths update the anchor freshness sentinel. The C5
# contract documents this — even the throttled JACOBIAN path
# counts as a recent anchor for AC-1.3 binning.
self._last_anchor_ns = time.monotonic_ns()
self._last_anchor_ns = self._clock.monotonic_ns()
if mode == "marginals":
gtsam_pose = _pose_se3_to_gtsam(self._pose_estimate_to_matrix(pose))
@@ -923,7 +927,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
# AZ-388: AC-5.2 entry hook. Engages fallback if the
# threshold has elapsed since the last successful estimate.
# Idempotent / rate-limited.
self._fallback.check_and_engage(time.monotonic_ns())
self._fallback.check_and_engage(self._clock.monotonic_ns())
if self._last_committed_pose_key is None:
raise EstimatorFatalError(
"current_estimate: no committed pose key yet (graph empty); "
@@ -975,7 +979,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
velocity_world = self._latest_velocity_or_zero()
last_anchor_age_ms = int(handle.last_anchor_age_ms())
source_label = self._derive_source_label()
emitted_at = time.monotonic_ns()
emitted_at = self._clock.monotonic_ns()
self._record_cov_norm_sample(emitted_at, covariance)
if self._isam2_state == IsamState.INIT:
@@ -1063,7 +1067,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
last_anchor_age_ms = int(handle.last_anchor_age_ms())
source_label = self._derive_source_label()
emitted_at = time.monotonic_ns()
emitted_at = self._clock.monotonic_ns()
out: list[EstimatorOutput] = []
for key, _ts in selected:
@@ -1366,7 +1370,7 @@ class GtsamIsam2StateEstimator(StateEstimator):
return
try:
machine.notify_satellite_anchor(
now_ns=time.monotonic_ns(),
now_ns=self._clock.monotonic_ns(),
gps_consistency_delta_m=None,
)
except Exception as exc:
@@ -26,7 +26,7 @@ if TYPE_CHECKING:
EstimatorHealth,
EstimatorOutput,
)
from gps_denied_onboard._types.vio import VioOutput
from gps_denied_onboard._types.nav import VioOutput
__all__ = ["StateEstimator"]
@@ -20,8 +20,7 @@ synchronously without a real serial port.
from __future__ import annotations
import threading
import time
from typing import Any, Final, Protocol
from typing import TYPE_CHECKING, Any, Final, Protocol
from gps_denied_onboard._types.fc import (
AttitudeSample,
@@ -34,10 +33,14 @@ from gps_denied_onboard._types.fc import (
TelemetryKind,
)
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c8_fc_adapter._subscription import SubscriptionBus
from gps_denied_onboard.components.c8_fc_adapter._telemetry_rings import TelemetryRing
from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
__all__ = [
"AP_MESSAGE_TYPES",
"MAVLinkSource",
@@ -108,9 +111,11 @@ class PymavlinkInboundDecoder:
attitude_ring_capacity: int = 100,
gps_ring_capacity: int = 20,
state_ring_capacity: int = 10,
clock: Clock | None = None,
) -> None:
self._source = source
self._bus = bus
self._clock: Clock = clock if clock is not None else WallClock()
self._log = get_logger("c8_fc_adapter.inbound_mavlink")
self.imu_ring: TelemetryRing[FcTelemetryFrame] = TelemetryRing(
imu_ring_capacity, kind_name="imu"
@@ -218,7 +223,7 @@ class PymavlinkInboundDecoder:
status = self._map_fix_type(fix_type)
if status is GpsStatus.STABLE:
status = self._maybe_promote_to_spoofed_or_non_spoofed()
captured_at = time.monotonic_ns()
captured_at = self._clock.monotonic_ns()
payload = GpsHealth(status=status, fix_age_ms=0, captured_at=captured_at)
# AC-5.1: cache warm-start hint on first 3D+ fix.
if fix_type >= 3:
@@ -232,7 +237,7 @@ class PymavlinkInboundDecoder:
return self._dispatch(TelemetryKind.GPS_HEALTH, payload, ring=self.gps_ring)
def _handle_heartbeat(self, msg: Any) -> bool:
captured_at = time.monotonic_ns()
captured_at = self._clock.monotonic_ns()
state = self._map_mav_state(
system_status=int(msg.system_status),
base_mode=int(msg.base_mode),
@@ -257,7 +262,7 @@ class PymavlinkInboundDecoder:
text = text.decode("utf-8", errors="replace")
if not any(sentinel.lower() in text.lower() for sentinel in _SPOOFING_SENTINELS):
return
captured_at = time.monotonic_ns()
captured_at = self._clock.monotonic_ns()
with self._lock:
self._spoof_sentinel_seen_at = captured_at
self._log.warning(
@@ -278,7 +283,7 @@ class PymavlinkInboundDecoder:
*,
ring: TelemetryRing[FcTelemetryFrame],
) -> bool:
received_at = time.monotonic_ns()
received_at = self._clock.monotonic_ns()
last = self._last_ts_ns.get(kind)
if last is not None and received_at <= last:
self._log.warning(
@@ -329,7 +334,7 @@ class PymavlinkInboundDecoder:
sentinel_at = self._spoof_sentinel_seen_at
if sentinel_at is None:
return GpsStatus.STABLE
now = time.monotonic_ns()
now = self._clock.monotonic_ns()
if (now - sentinel_at) <= 5 * 1_000_000_000:
return GpsStatus.SPOOFED
return GpsStatus.STABLE
@@ -17,8 +17,7 @@ Tests drive the decoder via :meth:`feed_one_tick` which calls the
from __future__ import annotations
import threading
import time
from typing import Any, Final, Protocol
from typing import TYPE_CHECKING, Any, Final, Protocol
from gps_denied_onboard._types.fc import (
AttitudeSample,
@@ -31,10 +30,14 @@ from gps_denied_onboard._types.fc import (
TelemetryKind,
)
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c8_fc_adapter._subscription import SubscriptionBus
from gps_denied_onboard.components.c8_fc_adapter._telemetry_rings import TelemetryRing
from gps_denied_onboard.logging import get_logger
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
__all__ = [
"Msp2InavInboundDecoder",
"MspSource",
@@ -74,9 +77,11 @@ class Msp2InavInboundDecoder:
attitude_ring_capacity: int = 100,
gps_ring_capacity: int = 20,
state_ring_capacity: int = 10,
clock: Clock | None = None,
) -> None:
self._source = source
self._bus = bus
self._clock: Clock = clock if clock is not None else WallClock()
self._log = get_logger("c8_fc_adapter.inbound_msp2")
self.imu_ring: TelemetryRing[FcTelemetryFrame] = TelemetryRing(
imu_ring_capacity, kind_name="imu"
@@ -118,10 +123,16 @@ class Msp2InavInboundDecoder:
return dispatched
def run_poll_loop(self, *, period_s: float = 0.01) -> None:
"""Continuous polling loop; honours :meth:`stop`."""
"""Continuous polling loop; honours :meth:`stop`.
Sleeps via the injected :class:`Clock` so replay binaries (which
wire a ``TlogDerivedClock``) advance instantly while the live
binary blocks for ``period_s`` between ticks.
"""
period_ns = int(period_s * 1_000_000_000)
while not self._stop_flag.is_set():
self.feed_one_tick()
time.sleep(period_s)
self._clock.sleep_until_ns(self._clock.monotonic_ns() + period_ns)
def stop(self) -> None:
self._stop_flag.set()
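The clock-driven poll loop above can be sketched in isolation. This is a minimal illustration, not the project's code: `FakeClock` and `Poller` are hypothetical stand-ins, and the only assumption taken from the diff is that the injected clock offers `monotonic_ns()` and `sleep_until_ns(deadline_ns)`.

```python
import threading


class FakeClock:
    """Hypothetical test double: time advances only when asked to sleep."""

    def __init__(self) -> None:
        self._now_ns = 0

    def monotonic_ns(self) -> int:
        return self._now_ns

    def sleep_until_ns(self, deadline_ns: int) -> None:
        # Jump straight to the deadline instead of blocking the thread.
        self._now_ns = max(self._now_ns, deadline_ns)


class Poller:
    """Hypothetical stand-in for a decoder; only the loop shape mirrors the diff."""

    def __init__(self, clock: FakeClock, max_ticks: int) -> None:
        self._clock = clock
        self._stop_flag = threading.Event()
        self._max_ticks = max_ticks
        self.ticks = 0

    def feed_one_tick(self) -> None:
        self.ticks += 1
        if self.ticks >= self._max_ticks:
            self._stop_flag.set()

    def run_poll_loop(self, *, period_s: float = 0.01) -> None:
        period_ns = int(period_s * 1_000_000_000)
        while not self._stop_flag.is_set():
            self.feed_one_tick()
            self._clock.sleep_until_ns(self._clock.monotonic_ns() + period_ns)
```

With a clock like this, a test can run many ticks without any wall-clock delay, which is presumably the property the replay binaries rely on.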
@@ -142,7 +153,7 @@ class Msp2InavInboundDecoder:
raise ValueError(
f"iNav IMU dict shape: expected 3-vectors, got accel={accel}, gyro={gyro}"
)
sensor_ts_ns = time.monotonic_ns()
sensor_ts_ns = self._clock.monotonic_ns()
payload = ImuTelemetrySample(ts_ns=sensor_ts_ns, accel_xyz=accel, gyro_xyz=gyro)
return self._dispatch(TelemetryKind.IMU_SAMPLE, payload, ring=self.imu_ring)
@@ -157,7 +168,7 @@ class Msp2InavInboundDecoder:
roll_rad = float(raw["angx"]) * (3.141592653589793 / 180.0)
pitch_rad = float(raw["angy"]) * (3.141592653589793 / 180.0)
yaw_rad = float(raw["heading"]) * (3.141592653589793 / 180.0)
sensor_ts_ns = time.monotonic_ns()
sensor_ts_ns = self._clock.monotonic_ns()
payload = AttitudeSample(
ts_ns=sensor_ts_ns,
roll_rad=roll_rad,
@@ -180,7 +191,7 @@ class Msp2InavInboundDecoder:
status = GpsStatus.DEGRADED
else:
status = GpsStatus.STABLE
captured_at = time.monotonic_ns()
captured_at = self._clock.monotonic_ns()
if fix >= 2:
lat_deg = float(raw["lat"]) / 1e7
lon_deg = float(raw["lon"]) / 1e7
@@ -198,7 +209,7 @@ class Msp2InavInboundDecoder:
return False
# iNav flight-state dict shape (subset we honour):
# 'armed': bool, 'in_flight': bool, 'failsafe': bool
captured_at = time.monotonic_ns()
captured_at = self._clock.monotonic_ns()
if raw.get("failsafe", False):
state = FlightState.FAILED
elif raw.get("in_flight", False):
@@ -233,7 +244,7 @@ class Msp2InavInboundDecoder:
*,
ring: TelemetryRing[FcTelemetryFrame],
) -> bool:
received_at = time.monotonic_ns()
received_at = self._clock.monotonic_ns()
last = self._last_ts_ns.get(kind)
if last is not None and received_at <= last:
self._log.warning(
@@ -24,10 +24,9 @@ Build flag: ``BUILD_GCS_QGC_MAVLINK``.
from __future__ import annotations
import threading
import time
from collections.abc import Callable
from datetime import datetime, timezone
from typing import Any, Final
from typing import TYPE_CHECKING, Any, Final
from gps_denied_onboard._types.fc import (
FcKind,
@@ -39,9 +38,13 @@ from gps_denied_onboard._types.fc import (
)
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard._types.state import EstimatorOutput
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c8_fc_adapter._covariance_projector import (
CovarianceProjector,
)
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c8_fc_adapter._subscription import SubscriptionBus
from gps_denied_onboard.components.c8_fc_adapter.errors import (
GcsAdapterConfigError,
@@ -110,14 +113,14 @@ class QgcTelemetryAdapter:
wgs_converter: Any,
covariance_projector: CovarianceProjector,
fdr_client: FdrClient,
clock: Callable[[], float] = time.monotonic,
clock: Clock | None = None,
connect_factory: Callable[[str, int], Any] | None = None,
) -> None:
self._config = config
self._wgs_converter = wgs_converter
self._cov_projector = covariance_projector
self._fdr_client = fdr_client
self._clock = clock
self._clock: Clock = clock if clock is not None else WallClock()
self._connect_factory = connect_factory
self._log = get_logger("c8_gcs_adapter.qgc")
# The modulo divisor — computed once at construction so unit
@@ -333,7 +336,7 @@ class QgcTelemetryAdapter:
return OperatorCommand(
command=msg_type,
payload=payload,
received_at=time.monotonic_ns(),
received_at=self._clock.monotonic_ns(),
)
def _record_operator_command_fdr(self, cmd: OperatorCommand, msg: Any) -> None:
@@ -374,4 +377,4 @@ class QgcTelemetryAdapter:
return wgs
def _clock_ms_boot(self) -> int:
return int(self._clock() * 1_000)
return self._clock.monotonic_ns() // 1_000_000
@@ -13,9 +13,8 @@ Build flag: ``BUILD_FC_INAV``.
from __future__ import annotations
import threading
import time
from collections.abc import Callable
from typing import Any, Final
from typing import TYPE_CHECKING, Any, Final
from gps_denied_onboard._types.emitted import EmittedExternalPosition
from gps_denied_onboard._types.fc import (
@@ -29,9 +28,13 @@ from gps_denied_onboard._types.fc import (
)
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard._types.state import EstimatorOutput
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c8_fc_adapter._covariance_projector import (
CovarianceProjector,
)
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c8_fc_adapter._msp2_sensor_gps_encoder import (
MSP2_SENSOR_GPS_CODE,
encode_msp2_sensor_gps,
@@ -71,7 +74,7 @@ class Msp2InavAdapter:
wgs_converter: Any,
covariance_projector: CovarianceProjector,
fdr_client: FdrClient,
clock: Callable[[], float] = time.monotonic,
clock: Clock | None = None,
msp_connect_factory: Callable[[str, int], Any] | None = None,
secondary_mavlink_factory: Callable[[], Any] | None = None,
) -> None:
@@ -79,7 +82,7 @@ class Msp2InavAdapter:
self._wgs_converter = wgs_converter
self._cov_projector = covariance_projector
self._fdr_client = fdr_client
self._clock = clock
self._clock: Clock = clock if clock is not None else WallClock()
self._msp_connect_factory = msp_connect_factory
self._secondary_mavlink_factory = secondary_mavlink_factory
self._log = get_logger("c8_fc_adapter.inav_adapter")
@@ -94,10 +97,12 @@ class Msp2InavAdapter:
# polling decoder lands in AZ-391; the per-adapter inbound
# composition happens in a follow-up batch).
self._bus = SubscriptionBus()
# Provenance rate-limiter for the secondary MAVLink STATUSTEXT.
# Provenance rate-limiter for the secondary MAVLink STATUSTEXT;
# the limiter expects a float-seconds clock, so we wrap the
# injected Clock's ns reading.
self._provenance = StatusTextTransitionRateLimiter(
send_statustext=self._send_statustext_secondary,
clock=time.monotonic,
clock=lambda: self._clock.monotonic_ns() / 1_000_000_000,
)
# ------------------------------------------------------------------
@@ -165,7 +170,7 @@ class Msp2InavAdapter:
raise FcEmitError("smoothed output cannot be emitted to FC (Invariant 6)")
h_pos_accuracy_mm = self._cov_projector.to_inav_h_pos_accuracy_mm(output)
wgs = self._extract_wgs84(output)
emitted_at = time.monotonic_ns()
emitted_at = self._clock.monotonic_ns()
self._sequence_number = (self._sequence_number + 1) & 0xFF
seq = self._sequence_number
payload = encode_msp2_sensor_gps(
@@ -227,7 +232,7 @@ class Msp2InavAdapter:
state=FlightState.INIT,
last_valid_gps_hint_wgs84=None,
last_valid_gps_age_ms=None,
captured_at=time.monotonic_ns(),
captured_at=self._clock.monotonic_ns(),
)
# ------------------------------------------------------------------
@@ -22,10 +22,9 @@ from __future__ import annotations
import os
import secrets
import threading
import time
from collections.abc import Callable
from datetime import datetime, timezone
from typing import Any, Final
from typing import TYPE_CHECKING, Any, Final
from gps_denied_onboard._types.emitted import EmittedExternalPosition
from gps_denied_onboard._types.fc import (
@@ -39,9 +38,13 @@ from gps_denied_onboard._types.fc import (
)
from gps_denied_onboard._types.geo import LatLonAlt
from gps_denied_onboard._types.state import EstimatorOutput
from gps_denied_onboard.clock.wall_clock import WallClock
from gps_denied_onboard.components.c8_fc_adapter._covariance_projector import (
CovarianceProjector,
)
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c8_fc_adapter._inbound_mavlink import (
PymavlinkInboundDecoder,
)
@@ -94,7 +97,7 @@ class PymavlinkArdupilotAdapter:
wgs_converter: Any,
covariance_projector: CovarianceProjector,
fdr_client: FdrClient,
clock: Callable[[], float] = time.monotonic,
clock: Clock | None = None,
flight_id: str = "",
connect_factory: Callable[[str, int], Any] | None = None,
) -> None:
@@ -102,7 +105,7 @@ class PymavlinkArdupilotAdapter:
self._wgs_converter = wgs_converter
self._cov_projector = covariance_projector
self._fdr_client = fdr_client
self._clock = clock
self._clock: Clock = clock if clock is not None else WallClock()
self._flight_id = flight_id
self._connect_factory = connect_factory
self._signing_failure_threshold = max(1, int(config.fc.signing_failure_threshold))
@@ -122,10 +125,11 @@ class PymavlinkArdupilotAdapter:
self._bus = SubscriptionBus()
self._inbound: PymavlinkInboundDecoder | None = None
self._inbound_thread: threading.Thread | None = None
# Outbound provenance rate limiter.
# Outbound provenance rate limiter; wraps the injected Clock as a
# float-seconds callable (the limiter's existing API contract).
self._provenance = StatusTextTransitionRateLimiter(
send_statustext=self._send_statustext_internal,
clock=time.monotonic,
clock=self._monotonic_s,
)
# ------------------------------------------------------------------
@@ -226,7 +230,7 @@ class PymavlinkArdupilotAdapter:
raise FcEmitError("smoothed output cannot be emitted to FC (Invariant 6)")
horiz_accuracy_m = self._cov_projector.to_ardupilot_horiz_accuracy_m(output)
wgs = self._extract_wgs84(output)
emitted_at = time.monotonic_ns()
emitted_at = self._clock.monotonic_ns()
self._sequence_number += 1
seq = self._sequence_number
try:
@@ -312,7 +316,7 @@ class PymavlinkArdupilotAdapter:
if not self._opened or self._connection is None:
raise FcEmitError("adapter not opened")
self._enforce_single_writer()
now_ns = time.monotonic_ns()
now_ns = self._clock.monotonic_ns()
if self._last_switch_attempt_ns:
elapsed_s = (now_ns - self._last_switch_attempt_ns) / 1_000_000_000
if elapsed_s < _SWITCH_RATE_LIMIT_S:
@@ -388,7 +392,7 @@ class PymavlinkArdupilotAdapter:
state=FlightState.INIT,
last_valid_gps_hint_wgs84=None,
last_valid_gps_age_ms=None,
captured_at=time.monotonic_ns(),
captured_at=self._clock.monotonic_ns(),
)
payload = latest.payload
assert isinstance(payload, FlightStateSignal)
@@ -542,9 +546,9 @@ class PymavlinkArdupilotAdapter:
Returns the ACK message on match, or ``None`` on timeout. Other
COMMAND_ACK messages (for unrelated commands) are ignored.
"""
deadline = self._clock() + (timeout_ms / 1000.0)
deadline = self._monotonic_s() + (timeout_ms / 1000.0)
while True:
remaining = deadline - self._clock()
remaining = deadline - self._monotonic_s()
if remaining <= 0:
return None
try:
@@ -608,11 +612,14 @@ class PymavlinkArdupilotAdapter:
)
return wgs
def _monotonic_s(self) -> float:
return self._clock.monotonic_ns() / 1_000_000_000
def _clock_us(self) -> int:
return int(self._clock() * 1_000_000)
return self._clock.monotonic_ns() // 1_000
def _clock_ms_boot(self) -> int:
return int(self._clock() * 1_000)
return self._clock.monotonic_ns() // 1_000_000
def _fdr_signing_event(self, *, kind: str, kv: dict[str, Any]) -> None:
record = FdrRecord(
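The unit conversions introduced in this hunk can be checked with a small sketch. `FixedClock` is a hypothetical test double; the free functions mirror the adapter's private helpers but are not the project's API. Integer floor division on the nanosecond reading avoids the float rounding that the previous `int(self._clock() * 1_000)` form could introduce.

```python
class FixedClock:
    """Hypothetical clock that always reports the same monotonic reading."""

    def __init__(self, now_ns: int) -> None:
        self._now_ns = now_ns

    def monotonic_ns(self) -> int:
        return self._now_ns


def monotonic_s(clock: FixedClock) -> float:
    # Float seconds, for APIs that still expect a seconds-based callable.
    return clock.monotonic_ns() / 1_000_000_000


def clock_us(clock: FixedClock) -> int:
    # Whole microseconds, truncated toward zero for non-negative readings.
    return clock.monotonic_ns() // 1_000


def clock_ms_boot(clock: FixedClock) -> int:
    # Whole milliseconds, truncated toward zero for non-negative readings.
    return clock.monotonic_ns() // 1_000_000
```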
@@ -40,6 +40,13 @@ KNOWN_PAYLOAD_KEYS: Final[dict[str, frozenset[str]]] = {
"vio.tick": frozenset(
{"frame_id", "R", "t", "P", "last_anchor_age_ms", "mre_px", "imu_bias_norm"}
),
# AZ-332 / E-C1: emitted on every VioStrategy state transition
# (INIT->TRACKING->DEGRADED->LOST etc.). One record per transition;
# steady-state frames emit nothing on this kind. `frame_id` is the
# frame the transition was decided on (may be empty for INIT->...).
"vio.health": frozenset(
{"state", "consecutive_lost", "bias_norm", "strategy_label", "frame_id"}
),
"state.tick": frozenset({"frame_id", "fused_pose", "covariance_2x2", "estimator_label"}),
"tile_match": frozenset({"frame_id", "tile_id", "score", "match_count", "ransac_inliers"}),
"overrun": frozenset({"producer_id", "dropped_count"}),
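One way such a key registry could be used is to reject payloads that carry unregistered keys. The sketch below is an assumption about usage, not the validation the FDR client actually performs: `unknown_keys` is a hypothetical helper, and only the `vio.health` key set is copied from the diff.

```python
from __future__ import annotations

from typing import Any, Final

# Only the "vio.health" entry is reproduced here; the real registry is larger.
KNOWN_PAYLOAD_KEYS: Final[dict[str, frozenset[str]]] = {
    "vio.health": frozenset(
        {"state", "consecutive_lost", "bias_norm", "strategy_label", "frame_id"}
    ),
}


def unknown_keys(kind: str, payload: dict[str, Any]) -> frozenset[str]:
    """Return the payload keys that are not registered for ``kind``."""
    return frozenset(payload) - KNOWN_PAYLOAD_KEYS[kind]
```

A caller could treat a non-empty result as a programming error at enqueue time; whether the real client raises, logs, or drops is not stated in the diff.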
@@ -1,9 +1,21 @@
"""FrameSource interface + concrete implementations.
"""``FrameSource`` cross-cutting interface — public surface (AZ-398 v1.0.0).
The interface is bootstrap-stubbed here. `LiveCameraFrameSource` and
`VideoFileFrameSource` are owned by AZ-398.
Per AC-9, this module re-exports the Protocol and the error family
ONLY. Concrete strategies (``LiveCameraFrameSource``,
``VideoFileFrameSource``) live in their own modules and are imported
LAZILY by ``runtime_root.frame_source_factory.build_frame_source``;
this keeps the lazy-import boundary explicit and lets Tier-0 builds
omit the OpenCV runtime entirely.
"""
from gps_denied_onboard.frame_source.errors import (
FrameSourceConfigError,
FrameSourceError,
)
from gps_denied_onboard.frame_source.interface import FrameSource
__all__ = ["FrameSource"]
__all__ = [
"FrameSource",
"FrameSourceConfigError",
"FrameSourceError",
]
@@ -0,0 +1,48 @@
"""``FrameSource`` error taxonomy (AZ-398 v1.0.0).
Per the replay contract
(``_docs/02_document/contracts/replay/replay_protocol.md``), every
transient I/O failure on the camera path MUST surface as
:class:`FrameSourceError` (Invariant 4: replay must be deterministic,
silent ``None`` drops are forbidden).
The two-class hierarchy mirrors the C6/C7/C1 component taxonomies:
- :class:`FrameSourceError`: operational failures during streaming
(decode error, device disconnect, out-of-order frame).
- :class:`FrameSourceConfigError`: composition-time failures (build
flag OFF, missing dependency, invalid config).
"""
from __future__ import annotations
class FrameSourceError(RuntimeError):
"""Transient or fatal failure during frame ingestion.
Examples:
- A corrupt H.264 keyframe in the replay video file.
- An ordering violation: ``next_frame()`` returned a frame whose
``monotonic_ns`` is < the previous frame's (Invariant 3).
- A USB camera disconnect mid-flight (live source).
The error message MUST identify the frame index or timestamp where
the failure occurred so the operator can correlate against the
upstream recording.
"""
class FrameSourceConfigError(RuntimeError):
"""Composition-time configuration failure for a frame source.
Examples:
- ``BUILD_VIDEO_FILE_FRAME_SOURCE=OFF`` and the binary tried to
construct :class:`VideoFileFrameSource`.
- The configured video path does not exist or is not readable.
- OpenCV is not importable (Tier-0 / docker-minimal build).
"""
__all__ = ["FrameSourceError", "FrameSourceConfigError"]
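Because both classes derive directly from `RuntimeError`, they are siblings: an `except FrameSourceError` clause does not catch a `FrameSourceConfigError`. The sketch below redeclares the two classes locally to show that; `classify` is a hypothetical helper for illustration only.

```python
class FrameSourceError(RuntimeError):
    """Local copy of the streaming-time error, for illustration."""


class FrameSourceConfigError(RuntimeError):
    """Local copy of the composition-time error, for illustration."""


def classify(exc: Exception) -> str:
    """Map an exception to the phase in which it is expected to occur."""
    if isinstance(exc, FrameSourceConfigError):
        return "config"
    if isinstance(exc, FrameSourceError):
        return "streaming"
    return "other"
```

A composition root that wants to handle both would therefore list both classes explicitly, for example `except (FrameSourceError, FrameSourceConfigError)`.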
@@ -1,18 +1,62 @@
"""`FrameSource` Protocol.
"""``FrameSource`` Protocol — public Layer 1 cross-cutting interface (AZ-398 v1.0.0).
Owned by AZ-398 (E-DEMO-REPLAY) for the formalisation; bootstrap ships the
interface stub so C1 can be constructor-injected against it.
Frozen per ``_docs/02_document/contracts/replay/replay_protocol.md``.
Two strategies implement this Protocol:
- :class:`LiveCameraFrameSource`: the formalised live camera ingest
path (gated ``BUILD_LIVE_CAMERA_FRAME_SOURCE``).
- :class:`VideoFileFrameSource`: the replay-only file decoder (gated
``BUILD_VIDEO_FILE_FRAME_SOURCE``).
Consumers (C1 :class:`VioStrategy`) accept a :class:`FrameSource` via
constructor injection so production code stays mode-agnostic
(Invariant 1).
"""
from __future__ import annotations
from collections.abc import Iterator
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.nav import NavCameraFrame
if TYPE_CHECKING:
from gps_denied_onboard._types.nav import NavCameraFrame
@runtime_checkable
class FrameSource(Protocol):
"""A source of `NavCameraFrame` instances."""
"""A pluggable camera-frame producer.
def frames(self) -> Iterator[NavCameraFrame]: ...
The Protocol exposes two methods and one ordering invariant:
- :meth:`next_frame` returns the next :class:`NavCameraFrame` (with
``metadata["monotonic_ns"]`` set by the strategy from its
injected :class:`Clock`) or ``None`` ONLY when the stream is
permanently exhausted (Invariant 4).
- Consecutive ``next_frame()`` returns MUST have non-decreasing
``metadata["monotonic_ns"]`` (Invariant 3); out-of-order frames
raise :class:`FrameSourceError`.
- :meth:`close` releases the underlying capture handle and is
idempotent (AC-10).
"""
def next_frame(self) -> "NavCameraFrame | None":
"""Return the next frame, ``None`` on end-of-stream.
Transient I/O failures (decode error, disconnect) MUST raise
:class:`FrameSourceError`, never return ``None`` silently
(Invariant 4). After ``None`` has been returned once, every
subsequent call MUST also return ``None`` (idempotent EOS).
"""
...
def close(self) -> None:
"""Release the underlying capture handle.
Idempotent: a second call is a no-op (AC-10); the strategy
SHOULD log a DEBUG line on the second call so a debug trace
can prove no double-free occurred.
"""
...
__all__ = ["FrameSource"]
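A consumer loop against this Protocol might look like the sketch below. It is a minimal illustration under stated assumptions: the Protocol is redeclared locally with `Any` in place of `NavCameraFrame`, and `ListFrameSource` and `drain` are hypothetical names that do not appear in the repository.

```python
from __future__ import annotations

from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class FrameSource(Protocol):
    """Local copy of the two-method Protocol shape, for illustration."""

    def next_frame(self) -> Any: ...

    def close(self) -> None: ...


class ListFrameSource:
    """Hypothetical in-memory source; satisfies the Protocol structurally."""

    def __init__(self, frames: list[dict[str, Any]]) -> None:
        self._frames = list(frames)
        self._index = 0
        self._closed = False

    def next_frame(self) -> dict[str, Any] | None:
        # Once exhausted or closed, every later call keeps returning None.
        if self._closed or self._index >= len(self._frames):
            return None
        frame = self._frames[self._index]
        self._index += 1
        return frame

    def close(self) -> None:
        self._closed = True  # A second call changes nothing.


def drain(source: FrameSource) -> list[Any]:
    """Pull frames until end-of-stream, always releasing the source."""
    frames: list[Any] = []
    try:
        while (frame := source.next_frame()) is not None:
            frames.append(frame)
    finally:
        source.close()
    return frames
```

Note that `runtime_checkable` only verifies that the methods exist; it does not check signatures or the ordering invariant.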
@@ -0,0 +1,161 @@
"""``LiveCameraFrameSource`` — live nav-camera ingest (AZ-398).
Wraps :class:`cv2.VideoCapture` against an integer device index (the
USB / CSI camera bound at boot by the airborne / research / operator
binaries). The strategy is intentionally minimal: each
:meth:`next_frame` call performs one blocking ``capture.read()`` and
returns the freshest frame; no dedicated decode thread, no ring
buffer. C1 (the only consumer) drives the loop at its target
rate, and a blocking read is the simplest way to apply backpressure.
Gated by ``BUILD_LIVE_CAMERA_FRAME_SOURCE`` (Invariant 9). The flag is
``ON`` for live / research / operator / replay binaries and ``OFF``
only for unit tests that need to construct a substitute without
touching a real camera.
"""
from __future__ import annotations
import logging
import os
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Any
from gps_denied_onboard.frame_source.errors import (
FrameSourceConfigError,
FrameSourceError,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard.clock import Clock
_BUILD_FLAG = "BUILD_LIVE_CAMERA_FRAME_SOURCE"
_logger = logging.getLogger(__name__)
def _build_flag_on() -> bool:
raw = os.environ.get(_BUILD_FLAG, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
class LiveCameraFrameSource:
"""Live :class:`FrameSource` strategy backed by ``cv2.VideoCapture``.
Constructor parameters:
- ``device_index``: integer index passed to ``cv2.VideoCapture``;
typically ``0`` for the first attached camera.
- ``camera_calibration_id``: string identifier baked into every
emitted frame (matches the intrinsics file shipped with the
binary).
- ``clock``: injected :class:`Clock`; supplies the per-frame
``monotonic_ns`` ordering key and the wall-clock timestamp.
"""
__slots__ = (
"_device_index",
"_camera_calibration_id",
"_clock",
"_capture",
"_frame_counter",
"_last_monotonic_ns",
"_closed",
)
def __init__(
self,
*,
device_index: int,
camera_calibration_id: str,
clock: "Clock",
) -> None:
if not _build_flag_on():
raise FrameSourceConfigError(
f"{_BUILD_FLAG} is OFF in this binary; "
"LiveCameraFrameSource is unavailable."
)
try:
import cv2 as _cv2
except ImportError as exc:
raise FrameSourceConfigError(
"LiveCameraFrameSource requires opencv-python; not "
"importable in this binary."
) from exc
capture = _cv2.VideoCapture(device_index)
if not capture.isOpened():
capture.release()
raise FrameSourceConfigError(
f"LiveCameraFrameSource: cv2.VideoCapture could not open "
f"device index {device_index}"
)
self._device_index = device_index
self._camera_calibration_id = camera_calibration_id
self._clock = clock
self._capture = capture
self._frame_counter = 0
self._last_monotonic_ns = -1
self._closed = False
def next_frame(self) -> "NavCameraFrame | None":
from gps_denied_onboard._types.nav import NavCameraFrame
if self._closed:
return None
ok, image = self._capture.read()
if not ok or image is None:
# Live camera: a failed read is a transient error (USB
# glitch, driver hiccup). Invariant 4 requires raising,
# not returning None — the only legitimate None is EOS,
# and a live camera never EOSes.
raise FrameSourceError(
f"LiveCameraFrameSource: cv2.VideoCapture.read failed at "
f"frame {self._frame_counter} (device "
f"{self._device_index})"
)
monotonic_ns = self._clock.monotonic_ns()
if monotonic_ns < self._last_monotonic_ns:
raise FrameSourceError(
f"LiveCameraFrameSource: clock went backwards at frame "
f"{self._frame_counter}: {monotonic_ns} ns followed "
f"{self._last_monotonic_ns} ns (Invariant 3)"
)
timestamp = datetime.fromtimestamp(
self._clock.time_ns() / 1e9, tz=timezone.utc
)
metadata: dict[str, Any] = {
"monotonic_ns": monotonic_ns,
"source": "live_camera",
"device_index": self._device_index,
}
frame = NavCameraFrame(
frame_id=self._frame_counter,
timestamp=timestamp,
image=image,
camera_calibration_id=self._camera_calibration_id,
metadata=metadata,
)
self._frame_counter += 1
self._last_monotonic_ns = monotonic_ns
return frame
def close(self) -> None:
if self._closed:
_logger.debug(
"LiveCameraFrameSource(device=%s) close called twice; no-op",
self._device_index,
)
return
self._closed = True
try:
self._capture.release()
except Exception: # pragma: no cover — defensive.
_logger.exception(
"LiveCameraFrameSource(device=%s) cv2.release() raised",
self._device_index,
)
__all__ = ["LiveCameraFrameSource"]
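The non-decreasing timestamp check used in `next_frame` can be isolated as a small guard. This is a sketch only: `MonotonicGuard` and `OrderingError` are hypothetical names (the real code raises `FrameSourceError` inline), and the `-1` sentinel is the one the diff uses for "no frame seen yet".

```python
class OrderingError(RuntimeError):
    """Hypothetical stand-in for FrameSourceError in this sketch."""


class MonotonicGuard:
    """Reject timestamps that go backwards; equal timestamps are allowed."""

    def __init__(self) -> None:
        self._last_ns = -1  # Sentinel: nothing observed yet.

    def check(self, monotonic_ns: int, frame_index: int) -> int:
        if monotonic_ns < self._last_ns:
            raise OrderingError(
                f"clock went backwards at frame {frame_index}: "
                f"{monotonic_ns} ns followed {self._last_ns} ns"
            )
        self._last_ns = monotonic_ns
        return monotonic_ns
```

The comparison is strict (`<`), so two frames stamped with the same reading pass, which matches the "non-decreasing" wording of Invariant 3.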
@@ -0,0 +1,199 @@
"""``VideoFileFrameSource`` — replay-only file decoder (AZ-398).
Streams an MP4 / MKV / AVI file frame-by-frame via OpenCV's
:class:`cv2.VideoCapture`. Each emitted :class:`NavCameraFrame`
carries:
- ``frame_id``: a strictly-increasing counter starting at 0.
- ``timestamp``: UTC wall-clock at decode time (from the injected
:class:`Clock`); the file's own pts is NOT used for this field
because replay deterministically remaps it.
- ``image``: the decoded BGR ``numpy.ndarray`` (OpenCV native order).
- ``metadata["monotonic_ns"]``: the injected :class:`Clock`'s
``monotonic_ns()`` at decode time. This is the value AC-2 asserts
non-decreasing.
- ``metadata["source_pts_ns"]``: the file's per-frame PTS in ns (the
``CAP_PROP_POS_MSEC`` reading × 1e6) for downstream determinism.
Gated by ``BUILD_VIDEO_FILE_FRAME_SOURCE`` (Invariant 9). The check is
performed at constructor entry, so a Tier-0 build that imports this
module by accident still raises ``FrameSourceConfigError`` cleanly
without attempting an OpenCV import.
"""
from __future__ import annotations
import logging
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import TYPE_CHECKING, Any
from gps_denied_onboard.frame_source.errors import (
FrameSourceConfigError,
FrameSourceError,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard.clock import Clock
_BUILD_FLAG = "BUILD_VIDEO_FILE_FRAME_SOURCE"
_logger = logging.getLogger(__name__)
def _build_flag_on() -> bool:
"""``ON`` / ``1`` / ``true`` / ``yes`` (case-insensitive) → ``True``."""
raw = os.environ.get(_BUILD_FLAG, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
class VideoFileFrameSource:
"""Replay :class:`FrameSource` strategy backed by ``cv2.VideoCapture``.
Stream-decodes a video file; per-frame decode is amortised by
OpenCV's internal buffer. The strategy preserves the file's frame
order: there is no seek, no random-access path; this keeps
replay deterministic (Invariant 10).
Constructor parameters:
- ``path``: filesystem path to an MP4/MKV/AVI (existence checked
at construction).
- ``camera_calibration_id``: string identifier propagated into
every emitted :class:`NavCameraFrame` so downstream consumers
(C1, C3, C4) load the correct intrinsics for the recording.
- ``clock``: injected :class:`Clock`; the strategy reads
``clock.monotonic_ns()`` per emitted frame for the
``metadata["monotonic_ns"]`` ordering field.
"""
__slots__ = (
"_path",
"_camera_calibration_id",
"_clock",
"_capture",
"_frame_counter",
"_last_monotonic_ns",
"_closed",
"_eos_returned",
)
def __init__(
self,
*,
path: Path | str,
camera_calibration_id: str,
clock: "Clock",
) -> None:
if not _build_flag_on():
raise FrameSourceConfigError(
f"{_BUILD_FLAG} is OFF in this binary; "
"VideoFileFrameSource is unavailable. Rebuild with the "
"flag set to ON in the replay binary's Dockerfile."
)
resolved = Path(path)
if not resolved.is_file():
raise FrameSourceConfigError(
f"VideoFileFrameSource: path {resolved!s} does not exist "
"or is not a regular file."
)
try:
import cv2 as _cv2
except ImportError as exc:
raise FrameSourceConfigError(
"VideoFileFrameSource requires opencv-python; not "
"importable in this binary."
) from exc
capture = _cv2.VideoCapture(str(resolved))
if not capture.isOpened():
capture.release()
raise FrameSourceConfigError(
f"VideoFileFrameSource: cv2.VideoCapture could not open "
f"{resolved!s} (unsupported codec or corrupt header)."
)
self._path = resolved
self._camera_calibration_id = camera_calibration_id
self._clock = clock
self._capture = capture
self._frame_counter = 0
self._last_monotonic_ns = -1
self._closed = False
self._eos_returned = False
def next_frame(self) -> "NavCameraFrame | None":
from gps_denied_onboard._types.nav import NavCameraFrame
if self._closed or self._eos_returned:
return None
try:
import cv2 as _cv2
except ImportError as exc: # pragma: no cover — established at __init__.
raise FrameSourceError(
"VideoFileFrameSource: opencv-python disappeared between "
"construction and next_frame()"
) from exc
ok, image = self._capture.read()
if not ok:
self._eos_returned = True
return None
if image is None:
# OpenCV's read() returning ok=True with image=None signals a
# decoder-internal failure for the current frame; treat as a
# transient error per Invariant 4 rather than silently
# advancing.
raise FrameSourceError(
f"VideoFileFrameSource: video decode failed at frame "
f"{self._frame_counter} (cv2.VideoCapture.read returned "
"ok=True with image=None)"
)
monotonic_ns = self._clock.monotonic_ns()
if monotonic_ns < self._last_monotonic_ns:
raise FrameSourceError(
f"VideoFileFrameSource: clock went backwards at frame "
f"{self._frame_counter}: {monotonic_ns} ns followed "
f"{self._last_monotonic_ns} ns (Invariant 3)"
)
pos_msec = float(self._capture.get(_cv2.CAP_PROP_POS_MSEC))
source_pts_ns = int(pos_msec * 1_000_000)
timestamp = datetime.fromtimestamp(
self._clock.time_ns() / 1e9, tz=timezone.utc
)
metadata: dict[str, Any] = {
"monotonic_ns": monotonic_ns,
"source_pts_ns": source_pts_ns,
"source": "video_file",
}
frame = NavCameraFrame(
frame_id=self._frame_counter,
timestamp=timestamp,
image=image,
camera_calibration_id=self._camera_calibration_id,
metadata=metadata,
)
self._frame_counter += 1
self._last_monotonic_ns = monotonic_ns
return frame
def close(self) -> None:
if self._closed:
_logger.debug(
"VideoFileFrameSource(%s) close called twice; no-op",
self._path,
)
return
self._closed = True
try:
self._capture.release()
except Exception: # pragma: no cover — defensive.
# cv2.VideoCapture.release should never raise; if it does on
# an exotic backend, we still want to flag the source as
# closed so a second close() stays a no-op.
_logger.exception(
"VideoFileFrameSource(%s) cv2.release() raised", self._path
)
__all__ = ["VideoFileFrameSource"]
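Both frame-source modules gate construction on an environment-variable build flag. The parsing rule can be sketched as a standalone function; the `environ` parameter is an addition for testability and is not part of the original `_build_flag_on()` helper, which reads `os.environ` directly.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

_TRUTHY = frozenset({"on", "1", "true", "yes"})


def build_flag_on(name: str, environ: Mapping[str, str] | None = None) -> bool:
    """Return True when the flag is set to a truthy value.

    Surrounding whitespace and letter case are ignored; an unset flag
    counts as OFF.
    """
    env = os.environ if environ is None else environ
    return env.get(name, "").strip().lower() in _TRUTHY
```

Anything outside the truthy set, including an empty string or `off`, is treated as OFF, so a misspelt value fails closed.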
@@ -23,12 +23,12 @@ def check() -> int:
from gps_denied_onboard._types import ( # noqa: F401
calibration,
emitted,
inference,
manifests,
matching,
nav,
pose,
tile,
vio,
vpr,
)
from gps_denied_onboard.logging import get_logger # noqa: F401
@@ -0,0 +1,159 @@
"""`FeatureExtractor` — shared image → :class:`KeypointSet` helper (AZ-343 scope expansion).
L1 helper analogous to :mod:`gps_denied_onboard.helpers.lightglue_runtime`
and :mod:`gps_denied_onboard.helpers.ransac_filter`. Produces a
:class:`gps_denied_onboard._types.matching.KeypointSet` (the same
DTO that :class:`LightGlueRuntime.match` consumes) from a raw BGR
image.
Why a shared helper:
- C2.5 :class:`InlierCountReRanker` (AZ-343) consumes one
:class:`FeatureExtractor` instance to extract features from each
per-frame nav-camera image AND from each candidate tile's JPEG
bytes. The same instance MUST produce comparable feature sets
for both inputs; otherwise the LightGlue inlier count would
collapse to noise.
- A future C3 backbone that wants to share keypoints with C2.5
(rather than re-extracting them) can read the same handle from
the composition root, mirroring the
:class:`LightGlueRuntime` ownership pattern (R14 fix).
Concrete impls:
- :class:`OpenCvOrbExtractor`: CPU, deterministic, placeholder used
by tests and by the airborne binary until the C7
:class:`InferenceRuntime`-backed DISK / ALIKED extractor lands.
ORB returns binary (``uint8``) descriptors of 32 bytes; we
convert to ``float32`` per the
:class:`gps_denied_onboard._types.matching.KeypointSet` contract.
- Future: TensorRT-backed DISK / ALIKED extractor; consumes
:class:`InferenceRuntime` from C7.
This helper is intentionally L1: it imports only ``numpy`` and
``cv2`` plus the L1 :class:`KeypointSet` DTO. Concrete strategies
that need GPU backbones live in their own modules and accept the
:class:`InferenceRuntime` via constructor injection.
"""
from __future__ import annotations
from typing import Protocol, runtime_checkable
import cv2
import numpy as np
from gps_denied_onboard._types.matching import KeypointSet
__all__ = [
"FeatureExtractor",
"FeatureExtractorError",
"OpenCvOrbExtractor",
]
# ORB descriptors are 32 bytes (256 bits). LightGlue's KeypointSet
# requires float32 descriptors so we widen ORB's uint8 output. This is
# a placeholder choice; production will swap in DISK/ALIKED (128-d
# float32) via the C7 InferenceRuntime path.
_ORB_DESCRIPTOR_BYTES = 32
_ORB_FLOAT_DESCRIPTOR_DIM = _ORB_DESCRIPTOR_BYTES * 8 # 256-d float32
class FeatureExtractorError(RuntimeError):
"""Raised on extractor construction or per-image failure."""
@runtime_checkable
class FeatureExtractor(Protocol):
"""Image → :class:`KeypointSet` Protocol.
Implementations are constructor-injected by the composition root
and shared across consumers (e.g., C2.5 :class:`InlierCountReRanker`
uses one instance for both query frames and tile pixels).
Invariants:
- :meth:`extract` returns a :class:`KeypointSet` whose
``descriptors.shape[1] == self.descriptor_dim()``.
- ``keypoints`` is shape ``(N, 2)`` ``float32`` pixel coordinates.
- ``descriptors`` is shape ``(N, descriptor_dim)`` ``float32``.
- Empty inputs (zero keypoints detected) return an empty-but-shaped
:class:`KeypointSet` (``N == 0``) rather than raising; the
C2.5 strategy treats zero-feature candidates as drop events.
- Deterministic for fixed inputs (no internal RNG state).
"""
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
"""Detect keypoints + compute descriptors on a single image."""
...
def descriptor_dim(self) -> int:
"""Return the dim of every descriptor row produced by :meth:`extract`."""
...
class OpenCvOrbExtractor:
"""CPU :class:`FeatureExtractor` backed by ``cv2.ORB_create``.
Placeholder implementation: ORB is fast (~5 ms / 480p image on a
modern CPU) and stable enough to exercise the C2.5 strategy's
orchestration logic, but its uint8 binary descriptors are NOT a
drop-in for LightGlue-trained DISK/ALIKED features. Production
deployments MUST replace this extractor with a deep-learning
backbone before flight (tracked under the future C2.5
backbone-extractor task).
The ``nfeatures`` constructor arg caps the number of keypoints
per image; default 1024 mirrors typical DISK / ALIKED budgets.
"""
def __init__(self, *, nfeatures: int = 1024) -> None:
if nfeatures < 1:
raise FeatureExtractorError(
f"OpenCvOrbExtractor.nfeatures must be >= 1; got {nfeatures}"
)
self._nfeatures: int = nfeatures
# ORB itself is created lazily so test environments without
# a working OpenCV install can still import this module.
# Cached on first call to amortise the per-image cost.
self._orb: cv2.ORB | None = None
def descriptor_dim(self) -> int:
return _ORB_FLOAT_DESCRIPTOR_DIM
def _get_orb(self) -> cv2.ORB:
if self._orb is None:
self._orb = cv2.ORB_create(nfeatures=self._nfeatures)
return self._orb
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
if image_bgr.ndim == 3:
gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
elif image_bgr.ndim == 2:
gray = image_bgr
else:
raise FeatureExtractorError(
"image_bgr must be 2-D (gray) or 3-D (BGR); "
f"got ndim={image_bgr.ndim} shape={image_bgr.shape}"
)
if gray.dtype != np.uint8:
gray = gray.astype(np.uint8)
try:
keypoints_cv, descriptors_uint8 = self._get_orb().detectAndCompute(
gray, mask=None
)
except cv2.error as exc:
raise FeatureExtractorError(f"cv2.ORB.detectAndCompute failed: {exc}") from exc
if descriptors_uint8 is None or len(keypoints_cv) == 0:
keypoints = np.zeros((0, 2), dtype=np.float32)
descriptors = np.zeros((0, _ORB_FLOAT_DESCRIPTOR_DIM), dtype=np.float32)
return KeypointSet(keypoints=keypoints, descriptors=descriptors)
keypoints = np.array(
[(kp.pt[0], kp.pt[1]) for kp in keypoints_cv], dtype=np.float32
)
# Expand each 32-byte ORB descriptor to a 256-d float32 vector
# of bit indicators (0/1). Matches the contract that
# ``KeypointSet.descriptors`` is float32.
bits = np.unpackbits(descriptors_uint8, axis=1).astype(np.float32)
return KeypointSet(keypoints=keypoints, descriptors=bits)
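The descriptor widening at the end of `extract` can be exercised on its own. The following is a minimal sketch, assuming only NumPy; `widen_binary_descriptors` is a hypothetical helper name, and the packed bytes are synthetic rather than real ORB output. It also shows why bit indicators are a reasonable float32 stand-in: the Hamming distance between two packed descriptors equals the squared L2 distance between their unpacked bit vectors.

```python
import numpy as np


def widen_binary_descriptors(descriptors_uint8: np.ndarray) -> np.ndarray:
    """Expand (N, B) packed uint8 descriptors to (N, 8 * B) float32 bit indicators."""
    if descriptors_uint8.ndim != 2 or descriptors_uint8.dtype != np.uint8:
        raise ValueError("expected a 2-D uint8 array of packed descriptors")
    # np.unpackbits emits the most significant bit of each byte first.
    return np.unpackbits(descriptors_uint8, axis=1).astype(np.float32)


# Two synthetic 32-byte descriptors (hypothetical data, not real ORB output).
packed = np.zeros((2, 32), dtype=np.uint8)
packed[0, 0] = 0b10000001  # two bits set in the first byte
packed[1, :] = 0xFF        # every bit set

bits = widen_binary_descriptors(packed)

# Hamming distance on the packed bytes vs. squared L2 distance on the bits.
hamming = int(np.unpackbits(packed[0] ^ packed[1]).sum())
squared_l2 = float(np.sum((bits[0] - bits[1]) ** 2))
```

Both distances agree because every bit that differs contributes exactly 1 to either measure.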
@@ -0,0 +1,61 @@
"""Composition-root :class:`Clock` factory (AZ-398).
Composition resolves :class:`Clock` exactly once per process, per the
"Single Clock per process" invariant. Live / research / operator
binaries call :func:`build_clock(kind="wall")`; the replay binary
calls :func:`build_clock(kind="tlog", source=...)` (the replay
composition root, AZ-401, wires the tlog timestamp source).
Concrete strategy modules (``wall_clock``, ``tlog_derived``) live
under :mod:`gps_denied_onboard.clock`; they are imported eagerly here
because the Clock has no Tier-specific runtime dependency and the
selection happens at startup.
"""
from __future__ import annotations
from collections.abc import Callable, Iterable
from typing import TYPE_CHECKING
from gps_denied_onboard.clock.tlog_derived import TlogDerivedClock
from gps_denied_onboard.clock.wall_clock import WallClock
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
__all__ = ["build_clock"]
def build_clock(
*,
kind: str = "wall",
source: Callable[[], int] | Iterable[int] | None = None,
) -> "Clock":
"""Construct the :class:`Clock` strategy for this process.
``kind`` is one of ``"wall"`` (default) or ``"tlog"``. ``source`` is
required when ``kind == "tlog"`` (it carries the tlog parser's
timestamp stream) and forbidden otherwise.
Raises :class:`ValueError` on an unknown ``kind`` or a misconfigured
source; neither is recoverable, so failing loudly at composition
time is correct.
"""
if kind == "wall":
if source is not None:
raise ValueError(
"build_clock(kind='wall'): source must be None; "
"WallClock takes no upstream timestamp stream."
)
return WallClock()
if kind == "tlog":
if source is None:
raise ValueError(
"build_clock(kind='tlog'): source is required (the tlog "
"timestamp stream from the replay parser)."
)
return TlogDerivedClock(source)
raise ValueError(
f"build_clock: unknown kind {kind!r}; expected 'wall' or 'tlog'"
)
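The argument-shape rules of `build_clock` can be tried without the real package. This is a sketch under stated assumptions: `WallClock` and `TlogDerivedClock` below are hypothetical stand-ins (the real classes live under `gps_denied_onboard.clock` and are not shown in this diff), and `monotonic_ns` is assumed as the read method because the rerank factory docstring refers to `clock.monotonic_ns()`.

```python
import time
from typing import Callable, Iterable, Union


class WallClock:
    """Hypothetical stand-in: reads the process monotonic clock."""

    def monotonic_ns(self) -> int:
        return time.monotonic_ns()


class TlogDerivedClock:
    """Hypothetical stand-in: replays timestamps from an upstream source."""

    def __init__(self, source: Union[Callable[[], int], Iterable[int]]) -> None:
        # Accept either a zero-argument callable or any iterable of ints.
        self._next = source if callable(source) else iter(source).__next__

    def monotonic_ns(self) -> int:
        return self._next()


def build_clock(*, kind: str = "wall", source=None):
    if kind == "wall":
        if source is not None:
            raise ValueError("build_clock(kind='wall'): source must be None")
        return WallClock()
    if kind == "tlog":
        if source is None:
            raise ValueError("build_clock(kind='tlog'): source is required")
        return TlogDerivedClock(source)
    raise ValueError(f"build_clock: unknown kind {kind!r}")


replay_clock = build_clock(kind="tlog", source=[1_000, 2_000, 3_000])
first, second = replay_clock.monotonic_ns(), replay_clock.monotonic_ns()

# Each misconfiguration fails at composition time, not at first use.
rejected = []
for bad_kwargs in ({"kind": "wall", "source": [1]}, {"kind": "tlog"}, {"kind": "gps"}):
    try:
        build_clock(**bad_kwargs)
    except ValueError:
        rejected.append(bad_kwargs["kind"])
```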
+13 -1
@@ -2,7 +2,7 @@
These are raised at composition time (``build_*`` factory entry) and
NOT during the running flight. Components own their per-runtime error
-families; this module owns the cross-component selection error.
+families; this module owns the cross-component selection errors.
"""
from __future__ import annotations
@@ -22,3 +22,15 @@ class RuntimeNotAvailableError(RuntimeError):
The message MUST name the requested runtime label so the operator can
correlate against ``.env``'s ``BUILD_*`` matrix without guessing.
"""
class StrategyNotAvailableError(RuntimeError):
"""Raised when ``build_vio_strategy`` is asked for a VIO strategy whose
compile-time ``BUILD_*`` flag is OFF (AZ-331).
Distinct from :class:`RuntimeNotAvailableError` because the C1
contract names this error type explicitly (AC-5). The message
MUST name both the requested strategy label and the missing
``BUILD_*`` flag so the operator can correlate against the
binary's compile matrix.
"""
@@ -0,0 +1,91 @@
"""Composition-root :class:`FrameSource` factory (AZ-398).
Selects exactly one :class:`FrameSource` strategy per binary based on
the requested ``kind`` and the compile-time ``BUILD_*`` flags. The
concrete strategy modules are imported lazily so a Tier-0 build with
``BUILD_LIVE_CAMERA_FRAME_SOURCE=OFF`` and
``BUILD_VIDEO_FILE_FRAME_SOURCE=OFF`` never pulls OpenCV into
``sys.modules`` (Invariant 9; verifiable via ``sys.modules``).
Build-flag gating happens INSIDE the strategy constructor (so unit
tests that monkey-patch the env still hit the gate); this factory
performs the strategy-name → module mapping only.
"""
from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING
from gps_denied_onboard.frame_source.errors import FrameSourceConfigError
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.frame_source import FrameSource
__all__ = ["build_frame_source"]
def build_frame_source(
*,
kind: str,
camera_calibration_id: str,
clock: "Clock",
device_index: int | None = None,
video_path: Path | str | None = None,
) -> "FrameSource":
"""Construct the :class:`FrameSource` strategy.
``kind`` is one of ``"live"`` or ``"video_file"``:
- ``"live"`` requires ``device_index`` (integer camera index) and
forbids ``video_path``.
- ``"video_file"`` requires ``video_path`` (filesystem path) and
forbids ``device_index``.
Build-flag gating is enforced by the strategy constructor; this
factory raises :class:`FrameSourceConfigError` ONLY on argument-
shape mistakes (missing or extra parameters for the chosen kind).
"""
if kind == "live":
if device_index is None:
raise FrameSourceConfigError(
"build_frame_source(kind='live'): device_index is required"
)
if video_path is not None:
raise FrameSourceConfigError(
"build_frame_source(kind='live'): video_path must be None"
)
from gps_denied_onboard.frame_source.live_camera import (
LiveCameraFrameSource,
)
return LiveCameraFrameSource(
device_index=device_index,
camera_calibration_id=camera_calibration_id,
clock=clock,
)
if kind == "video_file":
if video_path is None:
raise FrameSourceConfigError(
"build_frame_source(kind='video_file'): video_path is required"
)
if device_index is not None:
raise FrameSourceConfigError(
"build_frame_source(kind='video_file'): "
"device_index must be None"
)
from gps_denied_onboard.frame_source.video_file import (
VideoFileFrameSource,
)
return VideoFileFrameSource(
path=video_path,
camera_calibration_id=camera_calibration_id,
clock=clock,
)
raise FrameSourceConfigError(
f"build_frame_source: unknown kind {kind!r}; "
"expected 'live' or 'video_file'"
)
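The "verifiable via ``sys.modules``" claim in the module docstring lends itself to a small check. The sketch below is illustrative only: the stdlib module `colorsys` plays the role of a heavy strategy module, and `_KIND_TO_MODULE` and `build` are hypothetical names, not part of the factory above.

```python
import importlib
import sys

# Stand-in table: "colorsys" plays the role of an OpenCV-backed strategy
# module (hypothetical mapping, chosen only because it is cheap to import).
_KIND_TO_MODULE = {"video_file": ("colorsys", "rgb_to_hsv")}


def build(kind: str):
    if kind not in _KIND_TO_MODULE:
        raise ValueError(f"unknown kind {kind!r}")
    module_name, attr = _KIND_TO_MODULE[kind]
    # The import happens only on the selected branch.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


sys.modules.pop("colorsys", None)  # start from a clean slate for the demo

try:
    build("live")  # not in the stand-in table, so nothing is imported
except ValueError:
    pass
loaded_after_rejected_kind = "colorsys" in sys.modules

factory = build("video_file")
loaded_after_selected_kind = "colorsys" in sys.modules
```

A test in this style can assert that a build with both frame-source flags OFF never brings `cv2` into `sys.modules`.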
@@ -0,0 +1,196 @@
"""C3 matcher strategy composition-root factory (AZ-344).
:func:`build_matcher_strategy` selects exactly one strategy by
``config.components['c3_matcher'].strategy`` and respects
compile-time ``BUILD_MATCHER_<variant>`` gating: requesting a
strategy whose flag is OFF raises
:class:`StrategyNotAvailableError` at composition time (NOT at
first frame).
Concrete strategy modules
(``disk_lightglue``, ``aliked_lightglue``, ``xfeat``) are imported
lazily: a Tier-0 workstation build with
``BUILD_MATCHER_DISK_LIGHTGLUE=OFF`` MUST NOT load
``c3_matcher.disk_lightglue`` (ADR-002 / I-5; verifiable via
``sys.modules``).
The shared :class:`LightGlueRuntime` and :class:`RansacFilter` are
constructor-injected; the factory does NOT own their lifecycles.
The runtime root constructs ONE ``LightGlueRuntime`` instance and
passes the SAME reference to both this factory (C3) and the C2.5
``ReRankStrategy`` factory (per AC-10 / AZ-342 AC-10). The
identity-share gives the R14 fix substance: a regression that
constructs two runtimes would double GPU memory.
The :class:`RollingHealthWindow` accumulator is constructed BY
this factory (one per matcher instance) and passed to the
concrete strategy's ``create`` entry-point so all backbones share
window semantics (AZ-344 Outcome line 5).
"""
from __future__ import annotations
import logging
import os
from typing import TYPE_CHECKING
from gps_denied_onboard.components.c3_matcher._health_window import RollingHealthWindow
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
if TYPE_CHECKING:
from gps_denied_onboard.components.c3_matcher import (
C3MatcherConfig,
CrossDomainMatcher,
)
from gps_denied_onboard.components.c7_inference import InferenceRuntime
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["build_matcher_strategy"]
_LOG = logging.getLogger("gps_denied_onboard.c3_matcher")
# Strategy resolution table — mirrors the contract's
# ``cross_domain_matcher_protocol.md`` v1.0.0 § Composition-Root
# Factory table verbatim. ANY mutation of this dict MUST be
# mirrored in the contract.
_STRATEGY_TO_BUILD_FLAG: dict[str, str] = {
"disk_lightglue": "BUILD_MATCHER_DISK_LIGHTGLUE",
"aliked_lightglue": "BUILD_MATCHER_ALIKED_LIGHTGLUE",
"xfeat": "BUILD_MATCHER_XFEAT",
}
_STRATEGY_TO_MODULE: dict[str, tuple[str, str]] = {
"disk_lightglue": (
"gps_denied_onboard.components.c3_matcher.disk_lightglue",
"DiskLightGlueMatcher",
),
"aliked_lightglue": (
"gps_denied_onboard.components.c3_matcher.aliked_lightglue",
"AlikedLightGlueMatcher",
),
"xfeat": (
"gps_denied_onboard.components.c3_matcher.xfeat",
"XFeatMatcher",
),
}
def _is_build_flag_on(flag_name: str) -> bool:
"""Read a compile-time ``BUILD_*`` flag from the environment.
``ON`` / ``1`` / ``true`` / ``yes`` (case-insensitive) → ``True``;
anything else (including unset) → ``False``. Defaults to OFF so
test environments must opt-in explicitly per strategy.
"""
raw = os.environ.get(flag_name, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
def _c3_config(config: "Config") -> "C3MatcherConfig":
"""Pull the registered C3 config block.
``c3_matcher.__init__`` registers it on import; a missing
registration is a developer error and surfaces as ``KeyError``
rather than a silent fallback.
"""
return config.components["c3_matcher"]
def build_matcher_strategy(
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: "RansacFilter",
inference_runtime: "InferenceRuntime",
) -> "CrossDomainMatcher":
"""Construct the :class:`CrossDomainMatcher` impl selected by config.
1. Reads ``config.components['c3_matcher'].strategy``.
2. Checks the matching ``BUILD_MATCHER_<variant>`` flag; if
OFF, raises :class:`StrategyNotAvailableError` BEFORE any
import.
3. Constructs a :class:`RollingHealthWindow` seeded with
``config.components['c3_matcher'].min_inliers_threshold``.
4. Lazily imports the concrete strategy module.
5. Constructs the strategy via its module-level ``create(
config, lightglue_runtime, ransac_filter, inference_runtime,
health_window)`` factory function (each concrete strategy
module exports ``create`` as its public entry-point;
concrete constructors stay private).
6. Emits ONE INFO log ``kind="c3.matcher.strategy_loaded"``
with structured fields ``{strategy, min_inliers_threshold,
residual_warn_threshold_px}``.
Raises:
StrategyNotAvailableError: compile-time flag OFF or
concrete module not yet built
(AZ-345 / AZ-346 / AZ-347 pending).
"""
block = _c3_config(config)
strategy = block.strategy
flag_name = _STRATEGY_TO_BUILD_FLAG.get(strategy)
module_info = _STRATEGY_TO_MODULE.get(strategy)
if flag_name is None or module_info is None:
# Defensive — config validation rejects unknown strategy
# labels at load (``C3MatcherConfig.__post_init__``), so
# this branch is only reachable if the resolution table
# and the validation set drift apart.
_LOG.error(
"c3.matcher.build_flag_off",
extra={"strategy": strategy, "reason": "unknown_strategy"},
)
raise StrategyNotAvailableError(
f"CrossDomainMatcher {strategy!r} is not buildable in this binary."
)
if not _is_build_flag_on(flag_name):
_LOG.error(
"c3.matcher.build_flag_off",
extra={"strategy": strategy, "flag": flag_name},
)
raise StrategyNotAvailableError(
f"{flag_name} is OFF for this binary; "
f"cannot select strategy={strategy}."
)
health_window = RollingHealthWindow(
min_inliers_threshold=block.min_inliers_threshold,
)
module_name, class_name = module_info
try:
module = __import__(module_name, fromlist=[class_name])
except ModuleNotFoundError as exc:
raise StrategyNotAvailableError(
f"CrossDomainMatcher {strategy!r} is configured but its concrete "
f"impl module {module_name!r} has not been built into this binary "
"yet (AZ-345 / AZ-346 / AZ-347 pending)."
) from exc
create_fn = getattr(module, "create", None)
if create_fn is None:
strategy_cls = getattr(module, class_name)
instance = strategy_cls(
config,
lightglue_runtime=lightglue_runtime,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
health_window=health_window,
)
else:
instance = create_fn(
config,
lightglue_runtime=lightglue_runtime,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
health_window=health_window,
)
_LOG.info(
"c3.matcher.strategy_loaded",
extra={
"strategy": strategy,
"min_inliers_threshold": block.min_inliers_threshold,
"residual_warn_threshold_px": block.residual_warn_threshold_px,
},
)
return instance
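The build-flag gate described in steps 1 and 2 of the docstring can be sketched in isolation. Assumptions are labeled: `StrategyNotAvailableError` here is a local stand-in for the `runtime_root.errors` class, `require_flag` is a hypothetical helper, and the `env` parameter is added only so the example does not mutate `os.environ` (the factory above reads `os.environ` directly).

```python
import os


class StrategyNotAvailableError(RuntimeError):
    """Local stand-in for the runtime_root error of the same name."""


_TRUTHY = {"on", "1", "true", "yes"}


def is_build_flag_on(flag_name: str, env=None) -> bool:
    # `env` is a hypothetical testing seam; defaults to the process environment.
    env = os.environ if env is None else env
    return env.get(flag_name, "").strip().lower() in _TRUTHY


def require_flag(strategy: str, flag_name: str, env=None) -> None:
    # Fail before any import of the concrete strategy module.
    if not is_build_flag_on(flag_name, env):
        raise StrategyNotAvailableError(
            f"{flag_name} is OFF for this binary; cannot select strategy={strategy}."
        )


env = {"BUILD_MATCHER_XFEAT": " On ", "BUILD_MATCHER_DISK_LIGHTGLUE": "0"}
flag_names = (
    "BUILD_MATCHER_XFEAT",
    "BUILD_MATCHER_DISK_LIGHTGLUE",
    "BUILD_MATCHER_ALIKED_LIGHTGLUE",  # unset, so treated as OFF
)
results = {name: is_build_flag_on(name, env) for name in flag_names}

try:
    require_flag("disk_lightglue", "BUILD_MATCHER_DISK_LIGHTGLUE", env)
    gate_message = None
except StrategyNotAvailableError as exc:
    gate_message = str(exc)
```

Building the message from the table's flag name, rather than from the strategy label, keeps the error text correct even if a flag name ever stops mirroring its label.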
@@ -0,0 +1,142 @@
"""C3.5 refiner strategy composition-root factory (AZ-348).
:func:`build_refiner_strategy` selects exactly one strategy by
``config.components['c3_5_adhop'].strategy``. Both concrete
strategies are linked into the production binary
**unconditionally** (NO ``BUILD_REFINER_*`` flag; this is NOT
ADR-002 territory). Runtime selection only per ADR-001.
Strategy resolution table mirrors the contract's
``conditional_refiner_protocol.md`` v1.0.0 § Composition-root
factory table verbatim:
* ``"adhop"`` → ``gps_denied_onboard.components.c3_5_adhop.adhop_refiner.AdHoPRefiner`` (AZ-349; placeholder today).
* ``"passthrough"`` → ``gps_denied_onboard.components.c3_5_adhop.passthrough_refiner.PassthroughRefiner``.
The shared :class:`RansacFilter` and C7 :class:`InferenceRuntime`
handles are constructor-injected; the factory does NOT own their
lifecycles. The runtime root constructs ONE
:class:`RansacFilter` instance and identity-shares it across C3,
C3.5, and C4 (per ``ransac_filter.md`` v1.0.0).
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING
from gps_denied_onboard.components.c3_5_adhop.errors import RefinerConfigError
if TYPE_CHECKING:
from gps_denied_onboard.components.c3_5_adhop import (
C3_5RefinerConfig,
ConditionalRefiner,
)
from gps_denied_onboard.components.c7_inference import InferenceRuntime
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["build_refiner_strategy"]
_LOG = logging.getLogger("gps_denied_onboard.c3_5_adhop")
# Strategy resolution table — mirrors the contract verbatim. ANY
# mutation of this dict MUST be mirrored in the contract.
_STRATEGY_TO_MODULE: dict[str, tuple[str, str]] = {
"adhop": (
"gps_denied_onboard.components.c3_5_adhop.adhop_refiner",
"AdHoPRefiner",
),
"passthrough": (
"gps_denied_onboard.components.c3_5_adhop.passthrough_refiner",
"PassthroughRefiner",
),
}
def _c3_5_config(config: "Config") -> "C3_5RefinerConfig":
"""Pull the registered C3.5 config block.
``c3_5_adhop.__init__`` registers it on import; a missing
registration is a developer error and surfaces as ``KeyError``
rather than a silent fallback.
"""
return config.components["c3_5_adhop"]
def build_refiner_strategy(
config: "Config",
*,
ransac_filter: "RansacFilter",
inference_runtime: "InferenceRuntime",
) -> "ConditionalRefiner":
"""Construct the :class:`ConditionalRefiner` impl selected by config.
1. Reads ``config.components['c3_5_adhop'].{strategy, residual_threshold_px}``.
2. Validates ``residual_threshold_px > 0``, a defensive
redundancy on top of the config-load-time check
(:class:`C3_5RefinerConfig.__post_init__`); raises
:class:`RefinerConfigError` on failure.
3. Imports the concrete strategy module via the resolution
table (NOT lazy; both strategies are linked
unconditionally).
4. Constructs the strategy via its module-level
``create(config, ransac_filter, inference_runtime)``
factory function.
5. Emits ONE INFO log ``kind="c3_5.refiner.strategy_loaded"``
with ``{strategy, residual_threshold_px}``.
Raises:
RefinerConfigError: unknown strategy label OR invalid
threshold (``<= 0``).
"""
block = _c3_5_config(config)
strategy = block.strategy
module_info = _STRATEGY_TO_MODULE.get(strategy)
if module_info is None:
_LOG.error(
"c3_5.refiner.strategy_unknown",
extra={"strategy": strategy},
)
raise RefinerConfigError(f"Unknown refiner strategy: {strategy}")
if block.residual_threshold_px <= 0.0:
# Config-load validation should have rejected this already;
# defensive in case a caller constructed Config bypassing
# __post_init__ (e.g., via dataclasses.replace on a partial
# block).
_LOG.error(
"c3_5.refiner.invalid_threshold",
extra={
"strategy": strategy,
"residual_threshold_px": block.residual_threshold_px,
},
)
raise RefinerConfigError(
"residual_threshold_px must be > 0; "
f"got {block.residual_threshold_px}"
)
module_name, class_name = module_info
module = __import__(module_name, fromlist=[class_name])
create_fn = getattr(module, "create", None)
if create_fn is None:
strategy_cls = getattr(module, class_name)
instance = strategy_cls(
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
)
else:
instance = create_fn(
config,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
)
_LOG.info(
"c3_5.refiner.strategy_loaded",
extra={
"strategy": strategy,
"residual_threshold_px": block.residual_threshold_px,
},
)
return instance
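The `create`-or-constructor dispatch at the end of the factory can be illustrated with stand-in modules. In this sketch `SimpleNamespace` objects play the role of imported strategy modules, and `PassthroughRefiner`, `_create`, `instantiate`, and the `built_via` attribute are hypothetical; only the dispatch shape (prefer a module-level `create`, otherwise call the class without `config`) follows the code above.

```python
from types import SimpleNamespace


class PassthroughRefiner:
    """Hypothetical stand-in for a concrete strategy class."""

    def __init__(self, *, ransac_filter, inference_runtime):
        self.ransac_filter = ransac_filter
        self.inference_runtime = inference_runtime
        self.built_via = "constructor"


def _create(config, *, ransac_filter, inference_runtime):
    refiner = PassthroughRefiner(
        ransac_filter=ransac_filter, inference_runtime=inference_runtime
    )
    refiner.built_via = "create"
    return refiner


def instantiate(module, class_name, config, **deps):
    # Prefer the module-level entry point; fall back to the class itself.
    create_fn = getattr(module, "create", None)
    if create_fn is not None:
        return create_fn(config, **deps)
    return getattr(module, class_name)(**deps)


with_create = SimpleNamespace(create=_create, PassthroughRefiner=PassthroughRefiner)
without_create = SimpleNamespace(PassthroughRefiner=PassthroughRefiner)
deps = {"ransac_filter": object(), "inference_runtime": object()}

a = instantiate(with_create, "PassthroughRefiner", config={}, **deps)
b = instantiate(without_create, "PassthroughRefiner", config={}, **deps)
```

Either path hands the same injected dependencies to the strategy, so callers cannot tell which one was taken except through behaviour the strategy itself exposes.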
@@ -0,0 +1,166 @@
"""C2.5 ReRank strategy composition-root factory (AZ-342).
:func:`build_rerank_strategy` selects exactly one strategy by
``config.components['c2_5_rerank'].strategy`` and respects
compile-time ``BUILD_RERANK_<variant>`` gating: requesting a
strategy whose flag is OFF raises
:class:`StrategyNotAvailableError` at composition time (NOT at
first frame).
The shared :class:`LightGlueRuntime` is constructor-injected; the
factory does NOT own its lifecycle. The runtime root constructs ONE
``LightGlueRuntime`` instance and passes the same reference to both
this factory (C2.5) and the future C3 matcher factory (R14 fix; see
``description.md`` § 6).
Concrete strategy modules are imported lazily: a Tier-0 workstation
build with ``BUILD_RERANK_INLIER_COUNT=OFF`` MUST NOT load
``c2_5_rerank.inlier_based_reranker`` (ADR-002 / I-5; verifiable via
``sys.modules``).
"""
from __future__ import annotations
import logging
import os
from typing import TYPE_CHECKING
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c2_5_rerank import (
C2_5RerankConfig,
ReRankStrategy,
)
from gps_denied_onboard.components.c6_tile_cache import TileStore
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractor
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
__all__ = ["build_rerank_strategy"]
_LOG = logging.getLogger("gps_denied_onboard.c2_5_rerank")
# Strategy resolution table — mirrors the contract's
# ``rerank_strategy_protocol.md`` v1.0.0 § Composition-Root Factory
# table verbatim. ANY mutation here MUST be mirrored in the contract.
_STRATEGY_TO_BUILD_FLAG: dict[str, str] = {
"inlier_count": "BUILD_RERANK_INLIER_COUNT",
}
_STRATEGY_TO_MODULE: dict[str, tuple[str, str]] = {
"inlier_count": (
"gps_denied_onboard.components.c2_5_rerank.inlier_based_reranker",
"InlierCountReRanker",
),
}
def _is_build_flag_on(flag_name: str) -> bool:
raw = os.environ.get(flag_name, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
def _c2_5_config(config: "Config") -> "C2_5RerankConfig":
return config.components["c2_5_rerank"]
def build_rerank_strategy(
config: "Config",
*,
tile_store: "TileStore",
lightglue_runtime: "LightGlueRuntime",
feature_extractor: "FeatureExtractor",
clock: "Clock",
fdr_client: "FdrClient | None" = None,
) -> "ReRankStrategy":
"""Construct the :class:`ReRankStrategy` impl selected by config.
1. Reads ``config.components['c2_5_rerank'].strategy``.
2. Checks the matching ``BUILD_RERANK_<variant>`` flag; if OFF,
raises :class:`StrategyNotAvailableError` BEFORE any import.
3. Lazily imports the concrete strategy module.
4. Constructs the strategy via its module-level
``create(config, tile_store, lightglue_runtime, feature_extractor, clock, fdr_client)``
factory function (each concrete strategy module exports
``create`` as its public entry-point; concrete constructors
stay private).
5. Emits ONE INFO log ``kind="c2_5.rerank.strategy_loaded"`` with
structured fields ``{strategy, top_n}``.
``feature_extractor`` is a shared L1 helper (AZ-343 scope
expansion) used by the concrete strategy to extract keypoints +
descriptors from each per-frame nav image AND from each
candidate's tile pixels. ``clock`` is the composition-root
:class:`Clock` (AZ-398); strategies stamp
:attr:`RerankResult.reranked_at` via ``clock.monotonic_ns()``
rather than calling stdlib ``time`` directly (Invariant 2 of
the replay contract). ``fdr_client`` is optional; it is passed
through to strategies that emit FDR records; ``None`` lets the
strategy run without FDR emission (useful for tests).
Raises:
StrategyNotAvailableError: compile-time flag OFF or
concrete module not yet built.
"""
block = _c2_5_config(config)
strategy = block.strategy
flag_name = _STRATEGY_TO_BUILD_FLAG.get(strategy)
module_info = _STRATEGY_TO_MODULE.get(strategy)
if flag_name is None or module_info is None:
# Defensive — config validation rejects unknown strategy labels
# at load (C2_5RerankConfig.__post_init__).
_LOG.error(
"c2_5.rerank.build_flag_off",
extra={"strategy": strategy, "reason": "unknown_strategy"},
)
raise StrategyNotAvailableError(
f"ReRankStrategy {strategy!r} is not buildable in this binary."
)
if not _is_build_flag_on(flag_name):
_LOG.error(
"c2_5.rerank.build_flag_off",
extra={"strategy": strategy, "flag": flag_name},
)
raise StrategyNotAvailableError(
f"{flag_name} is OFF for this binary; "
f"cannot select strategy={strategy}."
)
module_name, class_name = module_info
try:
module = __import__(module_name, fromlist=[class_name])
except ModuleNotFoundError as exc:
raise StrategyNotAvailableError(
f"ReRankStrategy {strategy!r} is configured but its concrete impl "
f"module {module_name!r} has not been built into this binary "
"yet (AZ-343 pending)."
) from exc
create_fn = getattr(module, "create", None)
if create_fn is None:
strategy_cls = getattr(module, class_name)
instance = strategy_cls(
config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
else:
instance = create_fn(
config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
_LOG.info(
"c2_5.rerank.strategy_loaded",
extra={"strategy": strategy, "top_n": block.top_n},
)
return instance
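The identity-sharing rule for `LightGlueRuntime` (one instance, same reference handed to C2.5 and C3) is easy to regression-test. The sketch below uses hypothetical stand-in classes throughout; `instances_created` is an illustrative counter, not an attribute of the real runtime.

```python
class LightGlueRuntime:
    """Hypothetical stand-in; the real runtime is the expensive shared resource."""

    instances_created = 0

    def __init__(self) -> None:
        type(self).instances_created += 1


class Matcher:
    def __init__(self, *, lightglue_runtime: LightGlueRuntime) -> None:
        self.lightglue_runtime = lightglue_runtime


class ReRanker:
    def __init__(self, *, lightglue_runtime: LightGlueRuntime) -> None:
        self.lightglue_runtime = lightglue_runtime


def compose():
    # The composition root owns the runtime and constructs it exactly once.
    runtime = LightGlueRuntime()
    return Matcher(lightglue_runtime=runtime), ReRanker(lightglue_runtime=runtime)


matcher, reranker = compose()
shared = matcher.lightglue_runtime is reranker.lightglue_runtime
```

Checking identity with `is`, rather than equality, is what catches a regression in which a second runtime gets constructed.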
@@ -0,0 +1,114 @@
"""C1 VIO strategy composition-root factory (AZ-331).
:func:`build_vio_strategy` selects exactly one strategy by
``config.components['c1_vio'].strategy`` and respects compile-time
``BUILD_*`` gating: requesting a strategy whose flag is OFF raises
:class:`StrategyNotAvailableError` at composition time (NOT at first
frame).
Concrete strategy modules (``okvis2``, ``vins_mono``, ``klt_ransac``)
are imported lazily: a Tier-0 workstation build with
``BUILD_OKVIS2=OFF`` MUST NOT load ``c1_vio.okvis2`` (Risk-2 / I-5;
verifiable via ``sys.modules``).
"""
from __future__ import annotations
import os
from typing import TYPE_CHECKING
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
if TYPE_CHECKING:
from gps_denied_onboard.components.c1_vio import (
C1VioConfig,
VioStrategy,
)
from gps_denied_onboard.components.c13_fdr import FdrClient
from gps_denied_onboard.config.schema import Config
__all__ = ["build_vio_strategy"]
_STRATEGY_TO_BUILD_FLAG: dict[str, str] = {
"okvis2": "BUILD_OKVIS2",
"vins_mono": "BUILD_VINS_MONO",
"klt_ransac": "BUILD_KLT_RANSAC",
}
_STRATEGY_TO_MODULE: dict[str, tuple[str, str]] = {
"okvis2": ("gps_denied_onboard.components.c1_vio.okvis2", "Okvis2Strategy"),
"vins_mono": (
"gps_denied_onboard.components.c1_vio.vins_mono",
"VinsMonoStrategy",
),
"klt_ransac": (
"gps_denied_onboard.components.c1_vio.klt_ransac",
"KltRansacStrategy",
),
}
def _is_build_flag_on(flag_name: str) -> bool:
"""Read a compile-time ``BUILD_*`` flag from the environment.
``ON`` / ``1`` / ``true`` / ``yes`` (case-insensitive) → ``True``;
anything else (including unset) → ``False``. Defaults to OFF so
test environments must opt-in explicitly per strategy.
"""
raw = os.environ.get(flag_name, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
def _c1_config(config: "Config") -> "C1VioConfig":
"""Pull the registered C1 config block.
``c1_vio.__init__`` registers it on import; a missing
registration is a developer error and surfaces as ``KeyError``
rather than a silent fallback.
"""
return config.components["c1_vio"]
def build_vio_strategy(
config: "Config",
*,
fdr_client: "FdrClient",
) -> "VioStrategy":
"""Construct the :class:`VioStrategy` impl selected by config.
1. Reads ``config.components['c1_vio'].strategy``.
2. Checks the matching ``BUILD_*`` flag; if OFF, raises
:class:`StrategyNotAvailableError` BEFORE any import.
3. Lazily imports the concrete strategy module.
4. Constructs and returns the strategy instance, passing
``config`` and ``fdr_client``.
Raises :class:`StrategyNotAvailableError` when the compile-time
flag is OFF (canonical Tier-0 path) or when the concrete strategy
module has not been built yet (AZ-332 / AZ-333 / AZ-334 pending).
"""
block = _c1_config(config)
strategy = block.strategy
flag_name = _STRATEGY_TO_BUILD_FLAG.get(strategy)
module_info = _STRATEGY_TO_MODULE.get(strategy)
if flag_name is None or module_info is None:
raise StrategyNotAvailableError(
f"VioStrategy {strategy!r} is not buildable in this binary."
)
if not _is_build_flag_on(flag_name):
raise StrategyNotAvailableError(
f"VioStrategy {strategy!r} requires {flag_name}=ON in this "
"binary; the flag is OFF."
)
module_name, class_name = module_info
try:
module = __import__(module_name, fromlist=[class_name])
except ModuleNotFoundError as exc:
raise StrategyNotAvailableError(
f"VioStrategy {strategy!r} is configured but its concrete impl "
f"module {module_name!r} has not been built into this binary "
"yet (AZ-332 / AZ-333 / AZ-334 pending)."
) from exc
strategy_cls = getattr(module, class_name)
return strategy_cls(config, fdr_client=fdr_client)
@@ -0,0 +1,214 @@
"""C2 VPR strategy composition-root factory (AZ-336).
:func:`build_vpr_strategy` selects exactly one strategy by
``config.components['c2_vpr'].strategy`` and respects compile-time
``BUILD_VPR_<variant>`` gating: requesting a strategy whose flag is
OFF raises :class:`StrategyNotAvailableError` at composition time
(NOT at first frame).
Concrete strategy modules
(``ultra_vpr``, ``net_vlad``, ``mega_loc``, ``mix_vpr``,
``sela_vpr``, ``eigen_places``, ``salad``) are imported lazily:
a Tier-0 workstation build with ``BUILD_VPR_ULTRA_VPR=OFF`` MUST
NOT load ``c2_vpr.ultra_vpr`` (ADR-002 / I-5; verifiable via
``sys.modules``).
Pre-flight validation: after constructing the strategy, the factory
queries :meth:`VprStrategy.descriptor_dim` and asserts it matches
the C6 ``DescriptorIndex`` sidecar's ``descriptor_dim()``. Mismatch
→ :class:`ConfigurationError` at startup, NOT at first frame.
Factory signature deviates from the v1.0.0 contract draft in one
place: the contract spec named the second parameter ``tile_store:
TileStore``, but ``descriptor_dim()`` lives on
:class:`DescriptorIndex` per C6's actual Public API. We inject
``descriptor_index`` directly; the contract markdown is updated to
match.
"""
from __future__ import annotations
import logging
import os
from typing import TYPE_CHECKING
from gps_denied_onboard.config.schema import ConfigError
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
if TYPE_CHECKING:
from gps_denied_onboard.components.c2_vpr import C2VprConfig, VprStrategy
from gps_denied_onboard.components.c6_tile_cache import DescriptorIndex
from gps_denied_onboard.components.c7_inference import InferenceRuntime
from gps_denied_onboard.config.schema import Config
__all__ = ["build_vpr_strategy"]
_LOG = logging.getLogger("gps_denied_onboard.c2_vpr")
# Strategy resolution table — mirrors the contract's
# ``vpr_strategy_protocol.md`` v1.0.0 § Composition-Root Factory table
# verbatim. ANY mutation of this dict MUST be mirrored in the contract.
_STRATEGY_TO_BUILD_FLAG: dict[str, str] = {
"ultra_vpr": "BUILD_VPR_ULTRA_VPR",
"net_vlad": "BUILD_VPR_NETVLAD",
"mega_loc": "BUILD_VPR_MEGALOC",
"mix_vpr": "BUILD_VPR_MIXVPR",
"sela_vpr": "BUILD_VPR_SELAVPR",
"eigen_places": "BUILD_VPR_EIGENPLACES",
"salad": "BUILD_VPR_SALAD",
}
_STRATEGY_TO_MODULE: dict[str, tuple[str, str]] = {
"ultra_vpr": (
"gps_denied_onboard.components.c2_vpr.ultra_vpr",
"UltraVprStrategy",
),
"net_vlad": (
"gps_denied_onboard.components.c2_vpr.net_vlad",
"NetVladStrategy",
),
"mega_loc": (
"gps_denied_onboard.components.c2_vpr.mega_loc",
"MegaLocStrategy",
),
"mix_vpr": (
"gps_denied_onboard.components.c2_vpr.mix_vpr",
"MixVprStrategy",
),
"sela_vpr": (
"gps_denied_onboard.components.c2_vpr.sela_vpr",
"SelaVprStrategy",
),
"eigen_places": (
"gps_denied_onboard.components.c2_vpr.eigen_places",
"EigenPlacesStrategy",
),
"salad": (
"gps_denied_onboard.components.c2_vpr.salad",
"SaladStrategy",
),
}
def _is_build_flag_on(flag_name: str) -> bool:
"""Read a compile-time ``BUILD_*`` flag from the environment.
``ON`` / ``1`` / ``true`` / ``yes`` (case-insensitive) map to ``True``;
anything else (including unset) maps to ``False``. Defaults to OFF so
test environments must opt-in explicitly per strategy.
"""
raw = os.environ.get(flag_name, "")
return raw.strip().lower() in {"on", "1", "true", "yes"}
def _c2_config(config: "Config") -> "C2VprConfig":
"""Pull the registered C2 config block.
``c2_vpr.__init__`` registers it on import; a missing
registration is a developer error and surfaces as ``KeyError``
rather than a silent fallback.
"""
return config.components["c2_vpr"]
def build_vpr_strategy(
config: "Config",
*,
descriptor_index: "DescriptorIndex",
inference_runtime: "InferenceRuntime",
) -> "VprStrategy":
"""Construct the :class:`VprStrategy` impl selected by config.
1. Reads ``config.components['c2_vpr'].strategy``.
2. Checks the matching ``BUILD_VPR_<variant>`` flag; if OFF,
raises :class:`StrategyNotAvailableError` BEFORE any import.
3. Lazily imports the concrete strategy module.
4. Constructs the strategy via its module-level
``create(config, descriptor_index, inference_runtime)``
factory function (each concrete strategy module exports
``create`` as its public entry-point; concrete constructors
stay private).
5. Pre-flight ``descriptor_dim`` match: ``strategy.descriptor_dim()``
vs ``descriptor_index.descriptor_dim()``. Mismatch raises
:class:`ConfigError`; ONE ERROR log
``kind="c2.vpr.dim_mismatch"`` is emitted; the strategy is
NOT bound.
6. On success, ONE INFO log ``kind="c2.vpr.strategy_loaded"``
with ``strategy`` + ``descriptor_dim``.
Raises:
StrategyNotAvailableError: compile-time flag OFF or
concrete module not yet built (AZ-337..AZ-340 pending).
ConfigError: ``descriptor_dim`` mismatch between strategy
and corpus index.
"""
block = _c2_config(config)
strategy = block.strategy
flag_name = _STRATEGY_TO_BUILD_FLAG.get(strategy)
module_info = _STRATEGY_TO_MODULE.get(strategy)
if flag_name is None or module_info is None:
# Defensive — config validation rejects unknown strategy labels
# at load (C2VprConfig.__post_init__), so this branch is only
# reachable if the resolution table and the validation set
# drift apart.
_LOG.error(
"c2.vpr.build_flag_off",
extra={"strategy": strategy, "reason": "unknown_strategy"},
)
raise StrategyNotAvailableError(
f"VprStrategy {strategy!r} is not buildable in this binary."
)
if not _is_build_flag_on(flag_name):
_LOG.error(
"c2.vpr.build_flag_off",
extra={"strategy": strategy, "flag": flag_name},
)
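raise StrategyNotAvailableError(
# Report the real flag name: strategy.upper() would yield e.g.
# BUILD_VPR_NET_VLAD, but the table maps net_vlad to BUILD_VPR_NETVLAD.
f"{flag_name} is OFF for this binary; "
f"cannot select strategy={strategy}."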
)
module_name, class_name = module_info
try:
module = __import__(module_name, fromlist=[class_name])
except ModuleNotFoundError as exc:
raise StrategyNotAvailableError(
f"VprStrategy {strategy!r} is configured but its concrete impl "
f"module {module_name!r} has not been built into this binary "
"yet (AZ-337 / AZ-338 / AZ-339 / AZ-340 pending)."
) from exc
create_fn = getattr(module, "create", None)
if create_fn is None:
strategy_cls = getattr(module, class_name)
instance = strategy_cls(
config,
descriptor_index=descriptor_index,
inference_runtime=inference_runtime,
)
else:
instance = create_fn(
config,
descriptor_index=descriptor_index,
inference_runtime=inference_runtime,
)
strategy_dim = instance.descriptor_dim()
corpus_dim = descriptor_index.descriptor_dim()
if strategy_dim != corpus_dim:
_LOG.error(
"c2.vpr.dim_mismatch",
extra={
"strategy": strategy,
"strategy_dim": strategy_dim,
"corpus_dim": corpus_dim,
},
)
raise ConfigError(
f"descriptor_dim mismatch: strategy={strategy_dim}, "
f"corpus={corpus_dim}"
)
_LOG.info(
"c2.vpr.strategy_loaded",
extra={"strategy": strategy, "descriptor_dim": strategy_dim},
)
return instance
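The gating order used by the factory above (check the build flag first, import only afterwards) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's factory: `resolve`, `_TABLE`, the `BUILD_VPR_DEMO` flag and the local `StrategyNotAvailableError` are hypothetical names, and the stdlib module `colorsys` merely stands in for a concrete strategy module.

```python
import importlib
import os
import sys


class StrategyNotAvailableError(RuntimeError):
    """Hypothetical stand-in for the project's error type."""


# Hypothetical table: strategy label -> (build flag, module name, attribute).
# `colorsys` is a stdlib module used only as a placeholder for a strategy module.
_TABLE = {"demo": ("BUILD_VPR_DEMO", "colorsys", "rgb_to_hsv")}


def _flag_on(name: str) -> bool:
    """Treat ON / 1 / true / yes (case-insensitive) as enabled; default OFF."""
    return os.environ.get(name, "").strip().lower() in {"on", "1", "true", "yes"}


def resolve(label: str):
    """Return the entry point for `label`, importing its module lazily."""
    entry = _TABLE.get(label)
    if entry is None:
        raise StrategyNotAvailableError(f"unknown strategy {label!r}")
    flag, module_name, attr = entry
    if not _flag_on(flag):
        # Gate BEFORE importing so a flag-OFF build never loads the module.
        raise StrategyNotAvailableError(f"{flag} is OFF; cannot select {label!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

With the flag unset, `resolve("demo")` raises and the placeholder module stays out of `sys.modules`; after setting `BUILD_VPR_DEMO=ON` the same call imports it on demand.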
@@ -0,0 +1,105 @@
"""AC-4 — components MUST NOT call ``time.monotonic_ns`` / ``time.time_ns`` / ``time.sleep``.
Enforces Invariant 2 of the replay contract
(``_docs/02_document/contracts/replay/replay_protocol.md``): every
time-driven code path in a C* component consumes an injected
:class:`Clock` instead. Replay determinism (R-DEMO-4) collapses the
moment a component reaches into the stdlib ``time`` module directly,
so this guard runs on every PR touching ``src/gps_denied_onboard/components/``.
The scan is AST-based: docstrings and comments mentioning the forbidden
APIs do NOT trip it; only call sites and attribute references do.
"""
from __future__ import annotations
import ast
from pathlib import Path
import pytest
_FORBIDDEN_ATTRS: frozenset[str] = frozenset(
{"monotonic_ns", "time_ns", "sleep"}
)
_COMPONENTS_ROOT: Path = (
Path(__file__).parent.parent.parent
/ "src"
/ "gps_denied_onboard"
/ "components"
)
def _python_files_under(root: Path) -> list[Path]:
return sorted(p for p in root.rglob("*.py") if p.is_file())
def _find_direct_time_references(source: str) -> list[tuple[int, str]]:
"""Return ``(lineno, attribute_name)`` for every direct ``time.X`` ref.
Only flags ``ast.Attribute(value=ast.Name(id='time'), attr=<x>)``
where ``<x>`` is one of the forbidden names. Aliased imports
(``import time as t`` followed by ``t.monotonic_ns()``) are intentionally NOT
caught: the component code convention is to avoid such aliases, and
catching them would require flow-sensitive analysis.
"""
hits: list[tuple[int, str]] = []
tree = ast.parse(source)
for node in ast.walk(tree):
if not isinstance(node, ast.Attribute):
continue
if not isinstance(node.value, ast.Name):
continue
if node.value.id != "time":
continue
if node.attr in _FORBIDDEN_ATTRS:
hits.append((node.lineno, f"time.{node.attr}"))
return hits
def test_components_have_no_direct_time_references() -> None:
# Arrange
files = _python_files_under(_COMPONENTS_ROOT)
assert files, f"AST scan found no .py files under {_COMPONENTS_ROOT}"
offences: list[str] = []
# Act
for file in files:
source = file.read_text(encoding="utf-8")
for lineno, attr in _find_direct_time_references(source):
rel = file.relative_to(_COMPONENTS_ROOT.parent.parent.parent.parent)
offences.append(f"{rel}:{lineno}: {attr}")
# Assert
assert not offences, (
"Invariant 2 violation: direct stdlib-`time` references found in "
"`src/gps_denied_onboard/components/**/*.py`. Consume an injected "
"`Clock` (`gps_denied_onboard.clock`) instead.\n"
+ "\n".join(offences)
)
def test_scan_helper_detects_known_forbidden_pattern() -> None:
# Arrange — self-check the AST helper so a stale scan can't silently pass.
source = "import time\ndef f() -> int:\n return time.monotonic_ns()\n"
# Act
hits = _find_direct_time_references(source)
# Assert
assert hits == [(3, "time.monotonic_ns")]
def test_scan_helper_ignores_docstring_mentions() -> None:
# Arrange — docstrings naming the forbidden API must not trip the scan.
source = '"""This module talks about time.monotonic_ns in prose only."""\n'
# Act
hits = _find_direct_time_references(source)
# Assert
assert hits == []
@pytest.mark.parametrize("forbidden", sorted(_FORBIDDEN_ATTRS))
def test_scan_helper_detects_each_forbidden_attr(forbidden: str) -> None:
# Arrange
source = f"import time\ntime.{forbidden}()\n"
# Act
hits = _find_direct_time_references(source)
# Assert
assert hits == [(2, f"time.{forbidden}")]
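The guard above deliberately skips aliased imports. As a hedged sketch of one possible extension (not part of this commit), a first pass over the import statements can collect module aliases and `from time import ...` names before the attribute scan runs. `find_time_refs` and `FORBIDDEN` are hypothetical names; the sketch only follows import-level aliases and would still miss a reassignment such as `t = time`.

```python
import ast

FORBIDDEN = frozenset({"monotonic_ns", "time_ns", "sleep"})


def find_time_refs(source: str) -> list[tuple[int, str]]:
    """Return (lineno, 'time.<attr>') for forbidden uses, following import aliases."""
    tree = ast.parse(source)
    module_aliases: set[str] = set()      # names bound to the `time` module
    direct_names: dict[str, str] = {}     # local name -> forbidden function name

    # Pass 1: record how `time` and its forbidden functions are imported.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "time":
                    module_aliases.add(alias.asname or "time")
        elif isinstance(node, ast.ImportFrom) and node.module == "time" and node.level == 0:
            for alias in node.names:
                if alias.name in FORBIDDEN:
                    direct_names[alias.asname or alias.name] = alias.name

    # Pass 2: flag attribute access on a module alias and loads of imported names.
    hits: list[tuple[int, str]] = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in module_aliases
            and node.attr in FORBIDDEN
        ):
            hits.append((node.lineno, f"time.{node.attr}"))
        elif (
            isinstance(node, ast.Name)
            and isinstance(node.ctx, ast.Load)
            and node.id in direct_names
        ):
            hits.append((node.lineno, f"time.{direct_names[node.id]}"))
    return sorted(hits)
```

Unlike the helper in the test file, this variant only reports `time.X` when the source actually imports `time`, which is a behavioural difference worth weighing before adopting it.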
@@ -0,0 +1,187 @@
"""Shared fixtures for ``tests/unit/c1_vio/`` (AZ-332).
Provides a scriptable fake ``okvis2_binding`` module installed at the
``sys.modules`` boundary BEFORE the strategy's lazy import inside the
constructor runs. The fake mirrors the real binding's surface
(``Okvis2Backend`` class + 3 exception types) so :class:`Okvis2Strategy`
can be exercised on macOS dev + GitHub Actions Linux runner without
the real OKVIS2 / pybind11 native lib.
The task spec explicitly permits this for AC-3, AC-6, AC-7 backend-
exception injection (and by extension the rest of the AC suite that
exercises the Python facade only).
"""
from __future__ import annotations
import sys
from collections import deque
from collections.abc import Iterator
from dataclasses import dataclass, field
from typing import Any, Final
import numpy as np
import pytest
_BINDING_MODULE_NAME: Final[str] = "gps_denied_onboard.components.c1_vio._native.okvis2_binding"
_STRATEGY_MODULE_NAME: Final[str] = "gps_denied_onboard.components.c1_vio.okvis2"
# ---------------------------------------------------------------------------
# Fake exception types — Python classes mirroring the C++ side.
class FakeOkvisInitException(Exception):
pass
class FakeOkvisFatalException(Exception):
pass
class FakeOkvisOptimizationException(Exception):
pass
# ---------------------------------------------------------------------------
# Scripted output payload — what the fake backend pops on each add_frame.
@dataclass
class ScriptedOutput:
"""A single scripted backend response.
``produced`` mirrors the real binding's ``add_frame`` return: True
means a new estimator output is available via
:meth:`Okvis2Backend.get_latest_output`. ``raise_with`` (if not
None) is raised from ``add_frame`` instead of producing an output.
"""
produced: bool = True
raise_with: Exception | None = None
payload: dict[str, Any] = field(default_factory=dict)
def _make_default_payload(frame_id: str = "frame-0001") -> dict[str, Any]:
"""A 'tracking' payload — SPD covariance, tracked > threshold."""
return {
"frame_id": frame_id,
"pose_T_world_body": np.eye(4, dtype=np.float64),
"pose_covariance_6x6": np.eye(6, dtype=np.float64) * 0.01,
"accel_bias": np.zeros(3, dtype=np.float64),
"gyro_bias": np.zeros(3, dtype=np.float64),
"tracked_features": 80,
"new_features": 3,
"lost_features": 1,
"mean_parallax": 5.0,
"mre_px": 0.8,
"emitted_at_ns": 1_000_000_000,
}
# ---------------------------------------------------------------------------
# Scriptable fake Okvis2Backend.
class FakeOkvis2Backend:
def __init__(
self,
yaml_config: str,
camera_intrinsics_3x3: np.ndarray,
) -> None:
self.yaml_config = yaml_config
self.camera_intrinsics_3x3 = np.asarray(camera_intrinsics_3x3, dtype=np.float64)
self._scripted: deque[ScriptedOutput] = deque()
self._latest: dict[str, Any] | None = None
self._frames_seen: list[tuple[str, int]] = []
self._imu_samples: list[tuple[int, np.ndarray, np.ndarray]] = []
self._reset_calls: int = 0
self._health: dict[str, Any] = {
"state": "init",
"consecutive_lost": 0,
"bias_norm": 0.0,
}
# Test-only API — caller scripts the queue of responses.
def script(self, *outputs: ScriptedOutput) -> None:
self._scripted.extend(outputs)
# ---- Real surface mirrored 1:1 with the C++ binding. ----
def add_frame(self, frame_id: str, ts_ns: int, image: np.ndarray) -> bool:
self._frames_seen.append((frame_id, ts_ns))
if not self._scripted:
self._latest = _make_default_payload(frame_id)
return True
head = self._scripted.popleft()
if head.raise_with is not None:
raise head.raise_with
if head.produced:
payload = dict(_make_default_payload(frame_id))
payload.update(head.payload)
payload["frame_id"] = frame_id
self._latest = payload
return head.produced
def add_imu(self, ts_ns: int, accel: np.ndarray, gyro: np.ndarray) -> None:
self._imu_samples.append((ts_ns, np.asarray(accel), np.asarray(gyro)))
def get_latest_output(self) -> dict[str, Any] | None:
return self._latest
def reset(
self,
body_T_world: np.ndarray,
velocity: np.ndarray,
accel_bias: np.ndarray,
gyro_bias: np.ndarray,
) -> None:
self._reset_calls += 1
self._latest = None
self._health["state"] = "init"
self._health["consecutive_lost"] = 0
def health(self) -> dict[str, Any]:
return dict(self._health)
# ---- Test introspection helpers (NOT part of the real binding). ----
@property
def frames_seen(self) -> list[tuple[str, int]]:
return list(self._frames_seen)
@property
def reset_call_count(self) -> int:
return self._reset_calls
# ---------------------------------------------------------------------------
# Module fixture — installs fake `_native.okvis2_binding` at sys.modules.
@pytest.fixture
def fake_okvis2_binding(
monkeypatch: pytest.MonkeyPatch,
) -> Iterator[type[FakeOkvis2Backend]]:
"""Install a fake ``okvis2_binding`` module at the import boundary.
Cleans up both the binding module and the strategy module so each
test starts with a fresh lazy-import state.
"""
import types
fake_module = types.ModuleType(_BINDING_MODULE_NAME)
fake_module.Okvis2Backend = FakeOkvis2Backend # type: ignore[attr-defined]
fake_module.OkvisInitException = FakeOkvisInitException # type: ignore[attr-defined]
fake_module.OkvisFatalException = FakeOkvisFatalException # type: ignore[attr-defined]
fake_module.OkvisOptimizationException = ( # type: ignore[attr-defined]
FakeOkvisOptimizationException
)
sys.modules.pop(_BINDING_MODULE_NAME, None)
sys.modules.pop(_STRATEGY_MODULE_NAME, None)
monkeypatch.setitem(sys.modules, _BINDING_MODULE_NAME, fake_module)
yield FakeOkvis2Backend
sys.modules.pop(_STRATEGY_MODULE_NAME, None)
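The fixture above relies on the fact that the import system consults `sys.modules` before searching for a module, so a pre-installed fake satisfies a later lazy import. A minimal, pytest-free sketch of the same idea follows; `hypothetical_native_binding`, `FakeBackend`, `installed_fake` and `load_backend` are illustrative names, not identifiers from the repository.

```python
import contextlib
import importlib
import sys
import types

FAKE_NAME = "hypothetical_native_binding"  # illustrative module name


class FakeBackend:
    """Scriptable stand-in for a native backend class."""

    def add_frame(self, frame_id: str, ts_ns: int) -> bool:
        return True


def build_fake_module() -> types.ModuleType:
    """Create a module object exposing the fake backend under the real attribute name."""
    module = types.ModuleType(FAKE_NAME)
    module.Backend = FakeBackend
    return module


@contextlib.contextmanager
def installed_fake(name: str, module: types.ModuleType):
    """Install `module` under `name` in sys.modules and restore the old state on exit."""
    previous = sys.modules.get(name)
    sys.modules[name] = module
    try:
        yield module
    finally:
        if previous is None:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = previous


def load_backend() -> object:
    """Code under test: a lazy import that resolves to whatever sys.modules holds."""
    module = importlib.import_module(FAKE_NAME)
    return module.Backend()
```

Inside `with installed_fake(FAKE_NAME, build_fake_module()):` a call to `load_backend()` returns a `FakeBackend`; on exit the entry is removed again, which mirrors the cleanup that `monkeypatch.setitem` performs in the fixture.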
@@ -0,0 +1,545 @@
"""AZ-332 — :class:`Okvis2Strategy` acceptance criteria coverage.
Covers AC-1 through AC-10 (with AC-9 + NFR-perf tagged
``@pytest.mark.tier2`` per the carry-over plan; skipped on macOS dev
+ GitHub Actions Linux runner; run on Jetson via ``ci-tier2.yml``).
Uses the ``fake_okvis2_binding`` fixture from ``conftest.py`` to
script backend responses; the task spec explicitly permits a fake
binding for backend-exception injection (AC-3 / AC-6 / AC-7) and by
extension the rest of the Python-facade-only AC suite.
"""
from __future__ import annotations
from datetime import datetime, timezone
import gtsam
import numpy as np
import pytest
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.nav import (
ImuBias,
ImuSample,
ImuWindow,
NavCameraFrame,
VioOutput,
VioState,
WarmStartPose,
)
from gps_denied_onboard.components.c1_vio import (
C1VioConfig,
Okvis2Config,
VioError,
VioFatalError,
VioInitializingError,
)
from gps_denied_onboard.config.schema import Config, RuntimeConfig
from gps_denied_onboard.fdr_client.client import FdrClient
from gps_denied_onboard.fdr_client.records import FdrRecord
from tests.unit.c1_vio.conftest import (
FakeOkvis2Backend,
FakeOkvisFatalException,
FakeOkvisInitException,
FakeOkvisOptimizationException,
ScriptedOutput,
)
# ---------------------------------------------------------------------------
# Helpers.
def _zero_bias() -> ImuBias:
return ImuBias(accel_bias=(0.0, 0.0, 0.0), gyro_bias=(0.0, 0.0, 0.0))
def _calibration() -> CameraCalibration:
return CameraCalibration(
camera_id="test-cam",
intrinsics_3x3=np.eye(3, dtype=np.float64),
distortion=np.zeros(4, dtype=np.float64),
body_to_camera_se3=np.eye(4, dtype=np.float64),
acquisition_method="unit-test-static",
metadata={},
)
def _frame(idx: int = 1, ts_ns: int = 1_000_000_000) -> NavCameraFrame:
return NavCameraFrame(
frame_id=idx,
timestamp=datetime.fromtimestamp(ts_ns * 1e-9, tz=timezone.utc),
image=np.zeros((4, 4, 3), dtype=np.uint8),
camera_calibration_id="test-cam",
)
def _imu_window(ts_ns_start: int = 999_000_000, n: int = 3) -> ImuWindow:
samples = tuple(
ImuSample(
ts_ns=ts_ns_start + i * 5_000_000,
accel_xyz=(0.0, 0.0, 9.81),
gyro_xyz=(0.0, 0.0, 0.0),
)
for i in range(n)
)
return ImuWindow(
samples=samples,
ts_start_ns=samples[0].ts_ns,
ts_end_ns=samples[-1].ts_ns,
)
def _warm_start_hint() -> WarmStartPose:
return WarmStartPose(
body_T_world=gtsam.Pose3(np.eye(4)),
velocity_b=(0.5, 0.0, 0.0),
bias=ImuBias(
accel_bias=(0.01, -0.02, 0.0),
gyro_bias=(0.003, 0.0, -0.001),
),
captured_at_ns=1_000_000_000,
)
def _config(
okvis2_cfg: Okvis2Config | None = None,
lost_frame_threshold: int = 9,
warm_start_max_frames: int = 5,
) -> Config:
return Config.with_blocks(
c1_vio=C1VioConfig(
strategy="okvis2",
lost_frame_threshold=lost_frame_threshold,
warm_start_max_frames=warm_start_max_frames,
okvis2=okvis2_cfg or Okvis2Config(),
),
runtime=RuntimeConfig(camera_calibration_path=""),
)
@pytest.fixture
def fdr_client() -> FdrClient:
return FdrClient(producer_id="c1_vio.okvis2", capacity=256, _emit_diag_log=False)
def _build_strategy(
fdr_client: FdrClient,
config: Config | None = None,
):
"""Lazy import after the fake binding is installed in sys.modules."""
from gps_denied_onboard.components.c1_vio.okvis2 import Okvis2Strategy
return Okvis2Strategy(config or _config(), fdr_client=fdr_client)
def _drain(fdr_client: FdrClient) -> list[FdrRecord]:
return fdr_client.drain(max_records=1024)
# ===========================================================================
# AC-1: current_strategy_label returns "okvis2".
def test_ac1_current_strategy_label_returns_okvis2(fake_okvis2_binding, fdr_client) -> None:
strategy = _build_strategy(fdr_client)
assert strategy.current_strategy_label() == "okvis2"
# ===========================================================================
# AC-2: process_frame returns VioOutput with echoed frame_id, SPD cov, bias.
def test_ac2_process_frame_returns_vio_output_with_frame_id(
fake_okvis2_binding, fdr_client
) -> None:
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
backend.script(ScriptedOutput(produced=True))
out = strategy.process_frame(_frame(idx=42), _imu_window(), _calibration())
assert isinstance(out, VioOutput)
assert out.frame_id == "42"
assert out.pose_covariance_6x6.shape == (6, 6)
assert np.allclose(out.pose_covariance_6x6, out.pose_covariance_6x6.T)
eigvals = np.linalg.eigvalsh(out.pose_covariance_6x6)
assert np.all(eigvals > 0), "covariance must be SPD"
assert out.imu_bias is not None
assert out.feature_quality.tracked > 0
# ===========================================================================
# AC-3: backend exceptions rewrap into VioError with __cause__ chain.
@pytest.mark.parametrize(
"fake_exc_cls, expected_facade_exc",
[
(FakeOkvisInitException, VioInitializingError),
(FakeOkvisFatalException, VioFatalError),
],
)
def test_ac3_backend_exceptions_rewrap_to_vio_error_family(
fake_okvis2_binding, fdr_client, fake_exc_cls, expected_facade_exc
) -> None:
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
backend.script(ScriptedOutput(raise_with=fake_exc_cls("boom from backend")))
with pytest.raises(expected_facade_exc) as exc_info:
strategy.process_frame(_frame(), _imu_window(), _calibration())
assert isinstance(exc_info.value, VioError)
assert isinstance(exc_info.value.__cause__, fake_exc_cls)
def test_ac3_optimization_exception_during_init_rewraps_to_initializing(
fake_okvis2_binding, fdr_client
) -> None:
config = _config(warm_start_max_frames=5, lost_frame_threshold=9)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
backend.script(ScriptedOutput(raise_with=FakeOkvisOptimizationException("opt fail")))
with pytest.raises(VioInitializingError) as exc_info:
strategy.process_frame(_frame(), _imu_window(), _calibration())
assert isinstance(exc_info.value.__cause__, FakeOkvisOptimizationException)
def test_ac3_unmapped_runtime_error_rewraps_to_vio_fatal(fake_okvis2_binding, fdr_client) -> None:
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
backend.script(ScriptedOutput(raise_with=RuntimeError("library leaked this")))
with pytest.raises(VioFatalError) as exc_info:
strategy.process_frame(_frame(), _imu_window(), _calibration())
assert isinstance(exc_info.value.__cause__, RuntimeError)
# ===========================================================================
# AC-4: reset_to_warm_start clears state and seeds the hint; idempotent.
def test_ac4_reset_to_warm_start_clears_and_seeds(fake_okvis2_binding, fdr_client) -> None:
strategy = _build_strategy(fdr_client)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
hint = _warm_start_hint()
strategy.reset_to_warm_start(hint)
assert backend.reset_call_count == 1
health = strategy.health_snapshot()
assert health.state == VioState.INIT
assert health.consecutive_lost == 0
# bias_norm > 0 because the hint carries a non-zero bias
assert health.bias_norm > 0.0
def test_ac4_reset_to_warm_start_is_idempotent(fake_okvis2_binding, fdr_client) -> None:
strategy = _build_strategy(fdr_client)
hint = _warm_start_hint()
strategy.reset_to_warm_start(hint)
strategy.reset_to_warm_start(hint)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
assert backend.reset_call_count == 2
# ===========================================================================
# AC-5: INIT initially -> TRACKING after warm_start_max_frames frames.
def test_ac5_health_snapshot_init_then_tracking(fake_okvis2_binding, fdr_client) -> None:
config = _config(warm_start_max_frames=3)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
# AC-5 invariant: pre-frame snapshot is INIT.
assert strategy.health_snapshot().state == VioState.INIT
# Three successful frames (each "produced=True" -> tracked > threshold).
backend.script(
ScriptedOutput(produced=True),
ScriptedOutput(produced=True),
ScriptedOutput(produced=True),
)
for i in range(3):
strategy.process_frame(
_frame(idx=i + 1, ts_ns=1_000_000_000 + i * 1_000_000),
_imu_window(ts_ns_start=999_000_000 + i * 100_000_000),
_calibration(),
)
assert strategy.health_snapshot().state == VioState.TRACKING
# ===========================================================================
# AC-6: DEGRADED on feature loss; VioOutput STILL emitted (not raised);
# covariance Frobenius norm strictly increases on the degraded frame.
def test_ac6_degraded_on_feature_loss_emits_vio_output(fake_okvis2_binding, fdr_client) -> None:
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
# First frame: healthy (tracked >> degraded threshold).
healthy_payload = {
"tracked_features": 80,
"pose_covariance_6x6": np.eye(6, dtype=np.float64) * 0.01,
}
# Second frame: feature loss below the degraded threshold (default 30).
degraded_payload = {
"tracked_features": 5,
"pose_covariance_6x6": np.eye(6, dtype=np.float64) * 0.5,
}
backend.script(
ScriptedOutput(produced=True, payload=healthy_payload),
ScriptedOutput(produced=True, payload=degraded_payload),
)
healthy_out = strategy.process_frame(_frame(idx=1), _imu_window(), _calibration())
degraded_out = strategy.process_frame(
_frame(idx=2, ts_ns=1_100_000_000),
_imu_window(ts_ns_start=1_099_000_000),
_calibration(),
)
assert isinstance(degraded_out, VioOutput), "DEGRADED frame MUST emit output"
assert strategy.health_snapshot().state == VioState.DEGRADED
healthy_norm = np.linalg.norm(healthy_out.pose_covariance_6x6, ord="fro")
degraded_norm = np.linalg.norm(degraded_out.pose_covariance_6x6, ord="fro")
assert degraded_norm > healthy_norm, (
f"Frobenius norm must increase on DEGRADED frame "
f"(healthy={healthy_norm}, degraded={degraded_norm})"
)
# ===========================================================================
# AC-7: After lost_frame_threshold consecutive failures, raise VioFatalError;
# state == LOST.
def test_ac7_sustained_loss_raises_vio_fatal_error(fake_okvis2_binding, fdr_client) -> None:
config = _config(lost_frame_threshold=3, warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
# Three consecutive optimization failures.
backend.script(
ScriptedOutput(raise_with=FakeOkvisOptimizationException("loss-1")),
ScriptedOutput(raise_with=FakeOkvisOptimizationException("loss-2")),
ScriptedOutput(raise_with=FakeOkvisOptimizationException("loss-3")),
)
# First 2 are VioInitializingError (degraded path); third hits LOST.
with pytest.raises(VioInitializingError):
strategy.process_frame(_frame(idx=1), _imu_window(), _calibration())
with pytest.raises(VioInitializingError):
strategy.process_frame(
_frame(idx=2, ts_ns=1_100_000_000),
_imu_window(ts_ns_start=1_099_000_000),
_calibration(),
)
with pytest.raises(VioFatalError):
strategy.process_frame(
_frame(idx=3, ts_ns=1_200_000_000),
_imu_window(ts_ns_start=1_199_000_000),
_calibration(),
)
assert strategy.health_snapshot().state == VioState.LOST
# ===========================================================================
# AC-8: BUILD_OKVIS2=OFF lazy-import guarantee — complementary check.
# (Primary AC-8 coverage lives in test_protocol_conformance.py via the
# AZ-331 factory which gates BEFORE constructor.)
def test_ac8_strategy_module_not_imported_at_package_load(
monkeypatch,
) -> None:
"""Importing `c1_vio` itself MUST NOT load `c1_vio.okvis2`.
Risk-2 / I-5 guard: the factory respects the BUILD_OKVIS2 flag and
only triggers the import on demand. This complements the
test_ac5_build_vio_strategy_flag_off_no_import test in
test_protocol_conformance.py.
"""
import sys
sys.modules.pop("gps_denied_onboard.components.c1_vio.okvis2", None)
sys.modules.pop("gps_denied_onboard.components.c1_vio", None)
import importlib
importlib.import_module("gps_denied_onboard.components.c1_vio")
assert "gps_denied_onboard.components.c1_vio.okvis2" not in sys.modules
# ===========================================================================
# AC-9: tier2 — honest covariance Frobenius monotonically non-decreasing
# across a controlled-degradation window.
@pytest.mark.tier2
def test_ac9_honest_covariance_monotonic_during_degraded(fake_okvis2_binding, fdr_client) -> None:
"""Tier-2: 60 s controlled-degradation fixture; covariance MUST not
shrink during the DEGRADED window.
The fake binding here exercises the facade's enforcement contract —
real validation against OKVIS2's internal Hessian is the Jetson-side
follow-up that wires :class:`okvis::ThreadedKFVio` (skeleton today).
"""
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
# Healthy frame, then 5 DEGRADED frames with non-decreasing covariance.
base_cov = np.eye(6, dtype=np.float64) * 0.01
backend.script(
ScriptedOutput(produced=True, payload={"tracked_features": 80}),
*[
ScriptedOutput(
produced=True,
payload={
"tracked_features": 10,
"pose_covariance_6x6": base_cov * (1.0 + i),
},
)
for i in range(5)
],
)
outputs = []
for i in range(6):
outputs.append(
strategy.process_frame(
_frame(idx=i + 1, ts_ns=1_000_000_000 + i * 1_000_000),
_imu_window(ts_ns_start=999_000_000 + i * 100_000_000),
_calibration(),
)
)
import itertools
degraded_outputs = outputs[1:] # 5 DEGRADED frames
norms = [np.linalg.norm(o.pose_covariance_6x6, ord="fro") for o in degraded_outputs]
for prev, curr in itertools.pairwise(norms):
assert curr >= prev, (
f"covariance Frobenius norm must be monotonically non-decreasing "
f"during DEGRADED; got prev={prev}, curr={curr}"
)
# ===========================================================================
# AC-10: Exactly one vio.health record per state transition; no spam on
# steady-state.
def test_ac10_fdr_vio_health_emitted_per_transition(fake_okvis2_binding, fdr_client) -> None:
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
# The constructor itself does NOT emit a vio.health record; the first
# transition happens on the first frame. Document the invariant by
# asserting that the drain is empty here.
pre_records = _drain(fdr_client)
assert pre_records == [], "construction must not emit vio.health"
# Sequence: INIT -> TRACKING -> DEGRADED -> back to TRACKING.
backend.script(
ScriptedOutput(produced=True, payload={"tracked_features": 80}),
ScriptedOutput(produced=True, payload={"tracked_features": 80}), # steady
ScriptedOutput(produced=True, payload={"tracked_features": 10}), # DEGRADED
ScriptedOutput(produced=True, payload={"tracked_features": 80}), # TRACKING
)
for i in range(4):
strategy.process_frame(
_frame(idx=i + 1, ts_ns=1_000_000_000 + i * 1_000_000),
_imu_window(ts_ns_start=999_000_000 + i * 100_000_000),
_calibration(),
)
records = _drain(fdr_client)
assert all(r.kind == "vio.health" for r in records)
states = [r.payload["state"] for r in records]
# Expect: INIT -> TRACKING (frame 1), no record on frame 2 steady,
# TRACKING -> DEGRADED (frame 3), DEGRADED -> TRACKING (frame 4).
assert states == ["tracking", "degraded", "tracking"], (
f"unexpected transition sequence: {states}"
)
# ===========================================================================
# NFR-perf (tier2): p95 process_frame <= 80 ms on Tier-2 with real OKVIS2.
@pytest.mark.tier2
def test_nfr_perf_process_frame_p95_under_80ms(fake_okvis2_binding, fdr_client) -> None:
"""Tier-2: Real OKVIS2 binding + Derkachi-class fixture.
With the fake binding this measures Python facade overhead only,
i.e. the fixed cost that the real OKVIS2 latency is added to and
that must leave room in the budget. On Jetson tier2 this test runs
against the real binding and validates C1-PT-01.
"""
import time
config = _config(warm_start_max_frames=1)
strategy = _build_strategy(fdr_client, config)
backend: FakeOkvis2Backend = strategy._backend # type: ignore[attr-defined]
n = 200
backend.script(*[ScriptedOutput(produced=True) for _ in range(n)])
durations_ms: list[float] = []
for i in range(n):
t0 = time.perf_counter()
strategy.process_frame(
_frame(idx=i + 1, ts_ns=1_000_000_000 + i * 1_000_000),
_imu_window(ts_ns_start=999_000_000 + i * 100_000_000),
_calibration(),
)
durations_ms.append((time.perf_counter() - t0) * 1000.0)
durations_ms.sort()
p95 = durations_ms[int(0.95 * len(durations_ms))]
assert p95 <= 80.0, f"process_frame p95={p95:.3f} ms exceeds C1-PT-01 budget (80 ms)"
# ===========================================================================
# Construction guards.
def test_construct_with_wrong_strategy_label_raises(fake_okvis2_binding, fdr_client) -> None:
"""Constructing directly with a non-okvis2 strategy is a developer bug."""
bad_config = Config.with_blocks(c1_vio=C1VioConfig(strategy="klt_ransac"))
from gps_denied_onboard.components.c1_vio.okvis2 import Okvis2Strategy
with pytest.raises(VioFatalError):
Okvis2Strategy(bad_config, fdr_client=fdr_client)
def test_build_via_factory_returns_okvis2_strategy(
fake_okvis2_binding, fdr_client, monkeypatch
) -> None:
"""End-to-end factory wiring smoke — exercises the BUILD flag gate +
lazy import path the conformance tests don't touch for the real
`Okvis2Strategy` class.
"""
monkeypatch.setenv("BUILD_OKVIS2", "ON")
from gps_denied_onboard.components.c1_vio.okvis2 import Okvis2Strategy
from gps_denied_onboard.runtime_root.vio_factory import build_vio_strategy
instance = build_vio_strategy(_config(), fdr_client=fdr_client)
assert isinstance(instance, Okvis2Strategy)
assert instance.current_strategy_label() == "okvis2"
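The AC-10 expectation (one `vio.health` record per state transition, none in steady state) amounts to edge-triggered emission. The sketch below illustrates that pattern only; `HealthEmitter` and its fields are hypothetical, the real facade's state machine is not shown in this diff, and the threshold of 30 is taken from the "default 30" comment in the AC-6 test.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    TRACKING = "tracking"
    DEGRADED = "degraded"


class HealthEmitter:
    """Hypothetical edge-triggered emitter: records only when the state changes."""

    def __init__(self, degraded_below: int = 30) -> None:
        self.state = State.INIT
        self.records: list[dict[str, str]] = []
        self._degraded_below = degraded_below

    def observe(self, tracked_features: int) -> None:
        """Classify one frame and append a record only on a transition."""
        if tracked_features < self._degraded_below:
            new_state = State.DEGRADED
        else:
            new_state = State.TRACKING
        if new_state is not self.state:
            self.state = new_state
            self.records.append({"kind": "vio.health", "state": new_state.value})
```

Feeding the feature counts 80, 80, 10, 80 produces three records (tracking, degraded, tracking), the same sequence the AC-10 test asserts; the second frame emits nothing because the state did not change.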
@@ -0,0 +1,456 @@
"""AZ-331 — C1 VioStrategy Protocol + DTO + error + factory conformance.
Covers all 9 ACs of AZ-331 plus NFR-perf-factory and
NFR-reliability-error-family. The factory ACs (AC-4 / AC-5) substitute
fake strategy modules at ``sys.modules`` boundaries so the test never
touches OKVIS2 / VINS-Mono / OpenCV native libraries.
"""
from __future__ import annotations
import dataclasses
import re
import sys
import time
import types
from pathlib import Path
import gtsam
import numpy as np
import pytest
from gps_denied_onboard._types.nav import (
FeatureQuality,
ImuBias,
VioHealth,
VioOutput,
VioState,
WarmStartPose,
)
from gps_denied_onboard.components.c1_vio import (
C1VioConfig,
VioDegradedError,
VioError,
VioFatalError,
VioInitializingError,
VioStrategy,
)
from gps_denied_onboard.components.c1_vio.config import KNOWN_STRATEGIES
from gps_denied_onboard.config.schema import Config, ConfigError
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
from gps_denied_onboard.runtime_root.vio_factory import build_vio_strategy
_CONTRACT_PATH = (
Path(__file__).resolve().parents[3]
/ "_docs/02_document/contracts/c1_vio/vio_strategy_protocol.md"
)
_STRATEGY_MODULES: dict[str, tuple[str, str, str]] = {
"okvis2": (
"gps_denied_onboard.components.c1_vio.okvis2",
"Okvis2Strategy",
"BUILD_OKVIS2",
),
"vins_mono": (
"gps_denied_onboard.components.c1_vio.vins_mono",
"VinsMonoStrategy",
"BUILD_VINS_MONO",
),
"klt_ransac": (
"gps_denied_onboard.components.c1_vio.klt_ransac",
"KltRansacStrategy",
"BUILD_KLT_RANSAC",
),
}
# ----------------------------------------------------------------------
# Fakes that structurally satisfy the VioStrategy Protocol.
class _FullVioStrategy:
    def __init__(self, config: Config, *, fdr_client) -> None:
        self.config = config
        self.fdr_client = fdr_client
        self._label = config.components["c1_vio"].strategy

    def process_frame(self, frame, imu, calibration):
        raise NotImplementedError

    def reset_to_warm_start(self, hint):
        return None

    def health_snapshot(self):
        return VioHealth(state=VioState.INIT, consecutive_lost=0, bias_norm=0.0)

    def current_strategy_label(self):
        return self._label


class _PartialVioStrategy:
    def process_frame(self, frame, imu, calibration):
        raise NotImplementedError

    def reset_to_warm_start(self, hint):
        return None


def _config_with_strategy(strategy: str) -> Config:
    return Config.with_blocks(c1_vio=C1VioConfig(strategy=strategy))


def _install_fake_strategy(strategy_label: str) -> type:
    module_name, class_name, _flag = _STRATEGY_MODULES[strategy_label]

    class _FakeStrategy(_FullVioStrategy):
        pass

    _FakeStrategy.__name__ = class_name
    module = types.ModuleType(module_name)
    setattr(module, class_name, _FakeStrategy)
    sys.modules[module_name] = module
    return _FakeStrategy


@pytest.fixture
def strategy_module_cleanup():
    """Pop every fake strategy module before/after each factory test."""
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
    yield
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
def _zero_bias() -> ImuBias:
return ImuBias(accel_bias=(0.0, 0.0, 0.0), gyro_bias=(0.0, 0.0, 0.0))
def _neutral_feature_quality() -> FeatureQuality:
return FeatureQuality(tracked=20, new=2, lost=1, mean_parallax=5.0, mre_px=1.0)
def _make_vio_output(frame_id: str = "frame-0001") -> VioOutput:
return VioOutput(
frame_id=frame_id,
relative_pose_T=gtsam.Pose3(np.eye(4)),
pose_covariance_6x6=np.eye(6) * 0.01,
imu_bias=_zero_bias(),
feature_quality=_neutral_feature_quality(),
emitted_at_ns=1_000_000_000,
)
# ----------------------------------------------------------------------
# AC-1: Protocol is conformance-checkable.
def test_ac1_vio_strategy_conformance_full() -> None:
instance = _FullVioStrategy(_config_with_strategy("klt_ransac"), fdr_client=None)
assert isinstance(instance, VioStrategy)
def test_ac1_vio_strategy_conformance_partial_missing_methods() -> None:
assert not isinstance(_PartialVioStrategy(), VioStrategy)
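The AC-1 tests rely on `isinstance` working against a Protocol, which suggests `VioStrategy` is decorated with `typing.runtime_checkable` (an assumption; its definition is not shown here). A self-contained sketch of that mechanism, using the hypothetical names `StrategySketch`, `FullSketch`, and `PartialSketch`:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class StrategySketch(Protocol):
    """Hypothetical stand-in with the four method names the fakes above use."""

    def process_frame(self, frame, imu, calibration): ...
    def reset_to_warm_start(self, hint): ...
    def health_snapshot(self): ...
    def current_strategy_label(self): ...


class FullSketch:
    def process_frame(self, frame, imu, calibration):
        raise NotImplementedError

    def reset_to_warm_start(self, hint):
        return None

    def health_snapshot(self):
        return "init"

    def current_strategy_label(self):
        return "demo"


class PartialSketch:
    # Missing health_snapshot and current_strategy_label on purpose.
    def process_frame(self, frame, imu, calibration):
        raise NotImplementedError

    def reset_to_warm_start(self, hint):
        return None


full_ok = isinstance(FullSketch(), StrategySketch)
partial_ok = isinstance(PartialSketch(), StrategySketch)
```

A runtime-checkable Protocol only verifies that the named members exist; it does not compare signatures or return types, so a conformance test of this kind catches missing methods but not mismatched parameters.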
# ----------------------------------------------------------------------
# AC-2: frozen DTOs reject mutation.
@pytest.mark.parametrize(
"dto, field_name, new_value",
[
(_make_vio_output(), "frame_id", "renamed"),
(
VioHealth(state=VioState.TRACKING, consecutive_lost=0, bias_norm=0.0),
"state",
VioState.LOST,
),
(
WarmStartPose(
body_T_world=gtsam.Pose3(np.eye(4)),
velocity_b=(0.0, 0.0, 0.0),
bias=_zero_bias(),
captured_at_ns=1_000_000_000,
),
"captured_at_ns",
0,
),
(_neutral_feature_quality(), "tracked", 0),
],
)
def test_ac2_frozen_dtos_reject_mutation(dto, field_name: str, new_value) -> None:
original_value = getattr(dto, field_name)
with pytest.raises(dataclasses.FrozenInstanceError):
setattr(dto, field_name, new_value)
assert getattr(dto, field_name) == original_value
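The mutation check in AC-2 depends on standard frozen-dataclass behaviour. A minimal sketch with a hypothetical `HealthSketch` DTO (the real DTO fields may differ) shows both the rejection and the usual way to derive a changed copy:

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class HealthSketch:
    """Hypothetical frozen DTO used only for illustration."""

    state: str
    consecutive_lost: int


original = HealthSketch(state="tracking", consecutive_lost=0)

try:
    original.state = "lost"  # type: ignore[misc]
    mutation_rejected = False
except dataclasses.FrozenInstanceError:
    mutation_rejected = True

# dataclasses.replace builds a new instance and leaves the original untouched.
updated = dataclasses.replace(original, state="lost")
```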
# ----------------------------------------------------------------------
# AC-3: error hierarchy catchable as a single family.
@pytest.mark.parametrize(
"exc_factory",
[VioInitializingError, VioDegradedError, VioFatalError],
)
def test_ac3_all_vio_errors_caught_as_family(exc_factory) -> None:
with pytest.raises(VioError):
raise exc_factory("boom")
def test_ac3_unrelated_exception_not_caught_as_family() -> None:
    with pytest.raises(ValueError):
        try:
            raise ValueError("not us")
        except VioError:
            pytest.fail("ValueError must not be caught as VioError")


def test_ac3_strategy_not_available_outside_family() -> None:
    with pytest.raises(StrategyNotAvailableError):
        try:
            raise StrategyNotAvailableError("composition-time")
        except VioError:
            pytest.fail(
                "StrategyNotAvailableError is a composition-root error "
                "and MUST NOT be in the c1 VioError family"
            )
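The AC-3 tests describe one catchable error family plus a composition-time error that sits outside it. A sketch of that shape under assumed names (`FamilyError`, `DegradedSketch`, `NotAvailableSketch`, and `handle` are hypothetical; the real base classes are not shown in this diff):

```python
class FamilyError(Exception):
    """Hypothetical root of a component's runtime error family."""


class InitializingSketch(FamilyError):
    pass


class DegradedSketch(FamilyError):
    pass


class FatalSketch(FamilyError):
    pass


class NotAvailableSketch(RuntimeError):
    """Composition-time error, deliberately NOT a FamilyError subclass."""


def handle(exc: Exception) -> str:
    """Classify an exception the way a per-frame caller might."""
    try:
        raise exc
    except FamilyError:
        return "runtime-family"
    except Exception:
        return "outside-family"


outcomes = [handle(e) for e in (DegradedSketch("x"), NotAvailableSketch("y"), ValueError("z"))]
```

Keeping the composition-time error out of the family means a per-frame `except FamilyError` handler cannot accidentally swallow a wiring failure.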
# ----------------------------------------------------------------------
# AC-4 + AC-5: factory honours config + BUILD flag gate.
@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac4_build_vio_strategy_returns_protocol_impl(
monkeypatch, strategy_module_cleanup, strategy
) -> None:
_, _, flag = _STRATEGY_MODULES[strategy]
monkeypatch.setenv(flag, "ON")
fake_cls = _install_fake_strategy(strategy)
config = _config_with_strategy(strategy)
instance = build_vio_strategy(config, fdr_client=object())
assert isinstance(instance, fake_cls)
assert isinstance(instance, VioStrategy)
@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac5_build_vio_strategy_flag_off_no_import(
monkeypatch, strategy_module_cleanup, strategy
) -> None:
module_name, _, flag = _STRATEGY_MODULES[strategy]
monkeypatch.delenv(flag, raising=False)
config = _config_with_strategy(strategy)
with pytest.raises(StrategyNotAvailableError) as exc_info:
build_vio_strategy(config, fdr_client=object())
assert strategy in str(exc_info.value)
assert flag in str(exc_info.value)
assert module_name not in sys.modules
# Strategies that still have NO concrete Python module on disk.
# Once an AZ-332 / AZ-333 / AZ-334 implementation lands, the
# `flag_on_but_module_missing` semantics shift: the factory's import
# succeeds and the constructor then fails on a missing native binding
# or other prerequisite. The meaningful-error-before-first-frame
# property must hold in BOTH cases; only the exception class differs
# by strategy.
_STRATEGIES_WITHOUT_PY_MODULE: tuple[str, ...] = ("vins_mono", "klt_ransac")


@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac5_build_vio_strategy_flag_on_but_module_missing(
    monkeypatch, strategy_module_cleanup, strategy
) -> None:
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    config = _config_with_strategy(strategy)
    if strategy in _STRATEGIES_WITHOUT_PY_MODULE:
        # Module not yet implemented: the factory's __import__ raises
        # ModuleNotFoundError, rewrapped into StrategyNotAvailableError.
        with pytest.raises(StrategyNotAvailableError) as exc_info:
            build_vio_strategy(config, fdr_client=object())
        assert strategy in str(exc_info.value)
    else:
        # Module IS implemented (AZ-332). The factory import succeeds, then
        # the strategy constructor fails on the missing native binding,
        # which the strategy MUST surface as VioFatalError BEFORE any
        # frame is processed (the AC-5 spirit: no silent fall-through).
        with pytest.raises(VioFatalError) as exc_info:
            build_vio_strategy(config, fdr_client=object())
        assert "native binding" in str(exc_info.value)
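The AC-4 / AC-5 tests pin down three factory behaviours: a build flag gates the strategy, the module is imported lazily only when the flag is on, and a missing module is rewrapped into a composition-time error. The sketch below illustrates that pattern only; it is not the repository's `build_vio_strategy`. The registry, the `build` function, and the error class are hypothetical, and a standard-library class stands in for a strategy.

```python
import importlib
import os


class StrategyNotAvailableSketch(RuntimeError):
    """Hypothetical composition-time error."""


# label -> (module name, class name, build flag); entries are illustrative.
_REGISTRY = {
    "demo": ("json", "JSONDecoder", "BUILD_DEMO"),
    "ghost": ("no_such_module_for_sketch", "Ghost", "BUILD_GHOST"),
}


def build(label, environ=None):
    env = os.environ if environ is None else environ
    module_name, class_name, flag = _REGISTRY[label]
    if env.get(flag) != "ON":
        # Gate first, so a disabled strategy is never imported.
        raise StrategyNotAvailableSketch(f"strategy {label!r} requires {flag}=ON")
    try:
        module = importlib.import_module(module_name)  # lazy import
    except ModuleNotFoundError as exc:
        raise StrategyNotAvailableSketch(
            f"strategy {label!r}: module {module_name!r} is not importable"
        ) from exc
    return getattr(module, class_name)()


instance = build("demo", {"BUILD_DEMO": "ON"})
```

Naming both the strategy label and the flag in the message mirrors what the flag-off test asserts on, and makes a misconfigured build diagnosable from the exception text alone.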
# ----------------------------------------------------------------------
# AC-6: unknown strategy label rejected at config load.
@pytest.mark.parametrize(
"bad_label",
["openvslam", "orbslam3", "OKVIS2", "okvis", ""],
)
def test_ac6_unknown_strategy_rejected_at_config_load(bad_label: str) -> None:
with pytest.raises(ConfigError) as exc_info:
C1VioConfig(strategy=bad_label)
msg = str(exc_info.value)
for valid in KNOWN_STRATEGIES:
assert valid in msg
# ----------------------------------------------------------------------
# AC-7: current_strategy_label() matches config exactly.
@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac7_current_strategy_label_matches_config(
monkeypatch, strategy_module_cleanup, strategy
) -> None:
_, _, flag = _STRATEGY_MODULES[strategy]
monkeypatch.setenv(flag, "ON")
_install_fake_strategy(strategy)
config = _config_with_strategy(strategy)
instance = build_vio_strategy(config, fdr_client=object())
assert instance.current_strategy_label() == strategy
assert instance.current_strategy_label() == config.components["c1_vio"].strategy
# ----------------------------------------------------------------------
# AC-8: contract file matches Protocol shape.
_METHOD_TABLE_RE = re.compile(r"^\|\s*`(?P<name>[a-z_][a-z0-9_]*)`\s*\|", re.MULTILINE)
def _methods_from_contract() -> set[str]:
text = _CONTRACT_PATH.read_text(encoding="utf-8")
surface_start = text.index("### Protocol surface")
next_section = text.find("\n### ", surface_start + len("### Protocol surface"))
section = text[surface_start:next_section] if next_section != -1 else text[surface_start:]
return {m.group("name") for m in _METHOD_TABLE_RE.finditer(section)}
def _protocol_methods(proto: type) -> set[str]:
return {
name for name in dir(proto) if not name.startswith("_") and callable(getattr(proto, name))
}
def test_ac8_contract_methods_match_protocol() -> None:
contract_methods = _methods_from_contract()
protocol_methods = _protocol_methods(VioStrategy)
missing_in_protocol = contract_methods - protocol_methods
missing_in_contract = protocol_methods - contract_methods
assert not missing_in_protocol, (
"Methods declared in vio_strategy_protocol.md Shape section but "
f"missing from the Protocol: {sorted(missing_in_protocol)}"
)
assert not missing_in_contract, (
"Methods present on the Protocol but missing from the contract "
f"Shape section: {sorted(missing_in_contract)}"
)
def test_ac8_contract_lists_all_three_error_subtypes() -> None:
text = _CONTRACT_PATH.read_text(encoding="utf-8")
for name in {"VioInitializingError", "VioDegradedError", "VioFatalError"}:
assert name in text, f"Contract file is missing the documented error subtype {name!r}"
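The AC-8 helper slices one `###` section out of the contract file and collects backticked names from the first table column. The sketch below replays the same regular expression on an invented Markdown sample; the sample text and `methods_in_section` are hypothetical and do not reproduce the real contract file.

```python
import re

_ROW_RE = re.compile(r"^\|\s*`(?P<name>[a-z_][a-z0-9_]*)`\s*\|", re.MULTILINE)

SAMPLE = """\
### Protocol surface

| Method | Purpose |
|---|---|
| `process_frame` | per-frame entry point |
| `health_snapshot` | current health |

### Errors

| `not_a_method` | row outside the section |
"""


def methods_in_section(text, heading):
    """Collect backticked first-column names between `heading` and the next ### heading."""
    start = text.index(heading)
    end = text.find("\n### ", start + len(heading))
    section = text[start:end] if end != -1 else text[start:]
    return {m.group("name") for m in _ROW_RE.finditer(section)}


found = methods_in_section(SAMPLE, "### Protocol surface")
```

The header row has no backticks, so it is skipped, and the row under the later heading is excluded by the slice rather than by the pattern.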
# ----------------------------------------------------------------------
# AC-9: VioOutput.frame_id echo invariant is typed.
def test_ac9_vio_output_frame_id_is_typed_str() -> None:
"""``VioOutput.frame_id`` annotation is ``str`` per AZ-331 AC-9.
With ``from __future__ import annotations`` PEP-563 stringifies
every annotation at module load, so ``__annotations__`` returns
the literal ``'str'``. Compare against the string to avoid the
full ``get_type_hints`` forward-ref resolution path (which would
try to resolve neighbouring TYPE_CHECKING-only names like
:class:`SE3`).
"""
annotation = VioOutput.__annotations__["frame_id"]
assert annotation == "str", f"frame_id annotation should be 'str'; got {annotation!r}"
def test_ac9_vio_output_docstring_documents_echo_invariant() -> None:
docstring = VioOutput.__doc__ or ""
assert "echo" in docstring.lower(), (
"VioOutput docstring must document the frame_id echo invariant "
"(MUST equal NavCameraFrame.frame_id from the input frame)"
)
assert "frame_id" in docstring.lower()
# ----------------------------------------------------------------------
# NFRs.
@pytest.mark.parametrize(
"exc_type",
[VioInitializingError, VioDegradedError, VioFatalError],
)
def test_nfr_reliability_all_vio_errors_subclass_family(exc_type) -> None:
assert issubclass(exc_type, VioError)
def test_nfr_reliability_strategy_not_available_not_in_family() -> None:
assert not issubclass(StrategyNotAvailableError, VioError)
def test_nfr_perf_factory_under_200ms_p99(monkeypatch, strategy_module_cleanup) -> None:
"""Factory p99 ≤ 200 ms across 1000 calls (NFR-perf-factory)."""
strategy = "klt_ransac"
_, _, flag = _STRATEGY_MODULES[strategy]
monkeypatch.setenv(flag, "ON")
_install_fake_strategy(strategy)
config = _config_with_strategy(strategy)
durations_ms: list[float] = []
for _ in range(1000):
t0 = time.perf_counter()
build_vio_strategy(config, fdr_client=object())
durations_ms.append((time.perf_counter() - t0) * 1000.0)
durations_ms.sort()
p99 = durations_ms[int(0.99 * len(durations_ms))]
assert p99 <= 200.0, f"build_vio_strategy() p99={p99:.3f} ms exceeds 200 ms NFR"
# ----------------------------------------------------------------------
# Surface coverage.
def test_vio_state_enum_surface() -> None:
assert {v.value for v in VioState} == {"init", "tracking", "degraded", "lost"}
def test_c1_config_lost_frame_threshold_validation() -> None:
with pytest.raises(ConfigError):
C1VioConfig(lost_frame_threshold=0)
with pytest.raises(ConfigError):
C1VioConfig(lost_frame_threshold=-1)
def test_c1_config_warm_start_max_frames_validation() -> None:
with pytest.raises(ConfigError):
C1VioConfig(warm_start_max_frames=0)
def test_feature_quality_dto_constructs_and_freezes() -> None:
fq = _neutral_feature_quality()
with pytest.raises(dataclasses.FrozenInstanceError):
fq.mre_px = 99.0 # type: ignore[misc]
def test_warm_start_pose_constructs_with_zero_bias() -> None:
hint = WarmStartPose(
body_T_world=gtsam.Pose3(np.eye(4)),
velocity_b=(0.0, 0.0, 0.0),
bias=_zero_bias(),
captured_at_ns=1_000_000_000,
)
assert hint.captured_at_ns == 1_000_000_000
assert hint.bias.accel_bias == (0.0, 0.0, 0.0)
@@ -1,9 +0,0 @@
"""C1 VIO smoke test — AZ-263 AC-9: verify the component interface is importable."""
def test_interface_importable() -> None:
# Assert
from gps_denied_onboard.components.c1_vio import VioOutput, VioStrategy
assert VioStrategy is not None
assert VioOutput is not None
@@ -0,0 +1,889 @@
"""AZ-343 — :class:`InlierCountReRanker` acceptance + NFR coverage.
Covers AC-1..AC-12 from the task spec at
``_docs/02_tasks/todo/AZ-343_c2_5_inlier_count_reranker.md``.
Performance NFR (C2.5-PT-01: p95 ≤ 80 ms for 10 single-pair LightGlue
passes against the real TRT engine) is deferred to Step 9 / E-BBT per
the task's "Excluded" section — the harness here uses test doubles
that bypass real GPU work.
"""
from __future__ import annotations
import logging
from dataclasses import dataclass
import numpy as np
import pytest
from gps_denied_onboard._types.matching import CorrespondenceSet, KeypointSet
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
from gps_denied_onboard._types.vpr import VprCandidate, VprResult
from gps_denied_onboard.components.c2_5_rerank import (
C2_5RerankConfig,
RerankAllCandidatesFailedError,
ReRankStrategy,
)
from gps_denied_onboard.components.c2_5_rerank.inlier_based_reranker import (
InlierCountReRanker,
create,
)
from gps_denied_onboard.components.c6_tile_cache import TilePixelHandle
from gps_denied_onboard.components.c6_tile_cache.errors import TileNotFoundError
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrRecord
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractorError
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntimeError
# ----------------------------------------------------------------------
# Test doubles
@dataclass
class _FakeClock:
_t: int = 1_700_000_000_000_000_000
def monotonic_ns(self) -> int:
self._t += 1
return self._t
def time_ns(self) -> int:
return self._t
def sleep_until_ns(self, target_ns: int) -> None:
return None
class _FakeTilePixelHandle(TilePixelHandle):
"""Reusable :class:`TilePixelHandle` — supports multi-shot ``with`` blocks.
The buffer is mutable so AC-7 can prove identity (mutation through
one ``with`` block must be visible through the next).
"""
def __init__(self, jpeg_bytes: bytes, path):
self._buf = bytearray(jpeg_bytes)
self._path = path
@property
def filesystem_path(self):
return self._path
def __enter__(self) -> memoryview:
return memoryview(self._buf)
def __exit__(self, exc_type, exc_val, exc_tb) -> None:
return None
def mutate(self, new_bytes: bytes) -> None:
self._buf = bytearray(new_bytes)
def _synthesise_jpeg(seed: int) -> bytes:
"""Produce a deterministic colour JPEG keyed off ``seed``."""
import cv2
rng = np.random.default_rng(seed)
image = rng.integers(0, 255, size=(32, 32, 3), dtype=np.uint8)
ok, buf = cv2.imencode(".jpg", image)
assert ok, "cv2.imencode failed in test fixture"
return bytes(buf)
class _FakeTileStore:
"""Returns deterministic handles per ``tile_id``; can be told to fail."""
def __init__(self):
from pathlib import Path
self._handles: dict[tuple, _FakeTilePixelHandle] = {}
self._fail: set[tuple] = set()
self._path_base = Path("/tmp/c2_5_rerank_fake")
def install(self, tile_id, *, fail: bool = False, jpeg_seed: int | None = None) -> None:
if fail:
self._fail.add(tile_id)
return
if jpeg_seed is None:
jpeg_seed = hash(tile_id) & 0xFFFF
self._handles[tile_id] = _FakeTilePixelHandle(
jpeg_bytes=_synthesise_jpeg(jpeg_seed),
path=self._path_base / f"{tile_id}.jpg",
)
def handle(self, tile_id) -> _FakeTilePixelHandle:
return self._handles[tile_id]
def read_tile_pixels(self, tile_id):
if tile_id in self._fail:
raise TileNotFoundError(f"fake: {tile_id} marked as failing")
return self._handles[tile_id]
def write_tile(self, tile_blob, metadata):
raise NotImplementedError
def tile_exists(self, tile_id):
return tile_id in self._handles
def delete_tile(self, tile_id):
return self._handles.pop(tile_id, None) is not None
class _FakeFeatureExtractor:
"""Returns a deterministic :class:`KeypointSet` per image; can fail."""
def __init__(self) -> None:
self._fail_calls: set[int] = set()
self._call_count = 0
def fail_on(self, call_index: int) -> None:
self._fail_calls.add(call_index)
def descriptor_dim(self) -> int:
return 256
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
idx = self._call_count
self._call_count += 1
if idx in self._fail_calls:
raise FeatureExtractorError(f"fake extractor failing on call {idx}")
return KeypointSet(
keypoints=np.zeros((4, 2), dtype=np.float32),
descriptors=np.zeros((4, 256), dtype=np.float32),
)
class _ProgrammableLightGlue:
"""Returns the next pre-programmed :class:`CorrespondenceSet`; can raise."""
def __init__(self) -> None:
self._calls: list[
tuple[KeypointSet, KeypointSet]
] = []
self._results: list[object] = [] # CorrespondenceSet | Exception
def queue_inliers(self, count: int) -> None:
self._results.append(_make_correspondence_set(count))
def queue_error(self, exc: BaseException) -> None:
self._results.append(exc)
def descriptor_dim(self) -> int:
return 256
def match(self, features_a: KeypointSet, features_b: KeypointSet) -> CorrespondenceSet:
self._calls.append((features_a, features_b))
if not self._results:
raise AssertionError(
"fake LightGlue ran out of programmed responses; queue more"
)
result = self._results.pop(0)
if isinstance(result, BaseException):
raise result
return result
def match_batch(self, features_a_list, features_b_list):
raise NotImplementedError
@property
def calls(self) -> list[tuple[KeypointSet, KeypointSet]]:
return self._calls
class _CapturingFdrClient:
def __init__(self) -> None:
self.records: list[FdrRecord] = []
def enqueue(self, record: FdrRecord) -> None:
self.records.append(record)
def _make_correspondence_set(count: int) -> CorrespondenceSet:
return CorrespondenceSet(
correspondences=np.zeros((count, 4), dtype=np.float32),
scores=np.full((count,), 0.5, dtype=np.float32),
)
def _make_frame(frame_id: int = 7) -> NavCameraFrame:
from datetime import datetime, timezone
image = (np.random.default_rng(frame_id).integers(0, 255, (16, 16, 3))).astype(
np.uint8
)
return NavCameraFrame(
frame_id=frame_id,
timestamp=datetime.now(tz=timezone.utc),
image=image,
camera_calibration_id="cam0",
)
def _make_vpr_candidate(*, tile_id, distance: float) -> VprCandidate:
return VprCandidate(tile_id=tile_id, descriptor_distance=distance, descriptor_dim=256)
def _make_vpr_result(*, frame_id: int, candidates: list[VprCandidate]) -> VprResult:
return VprResult(
frame_id=frame_id,
candidates=tuple(candidates),
retrieved_at=10,
backbone_label="ultra_vpr",
)
def _build_reranker(
*,
tile_store: _FakeTileStore,
extractor: _FakeFeatureExtractor,
lightglue: _ProgrammableLightGlue,
fdr_client=None,
top_n: int = 3,
debug_per_frame_log: bool = False,
) -> InlierCountReRanker:
config = Config.with_blocks(
c2_5_rerank=C2_5RerankConfig(
strategy="inlier_count",
top_n=top_n,
debug_per_frame_log=debug_per_frame_log,
)
)
return InlierCountReRanker(
config=config,
tile_store=tile_store,
lightglue_runtime=lightglue,
feature_extractor=extractor,
clock=_FakeClock(),
fdr_client=fdr_client,
)
def _install_k_candidates(
tile_store: _FakeTileStore,
*,
k: int,
distances: list[float] | None = None,
fail_indices: set[int] | None = None,
) -> list[VprCandidate]:
distances = distances or [0.1 * i for i in range(k)]
fail_indices = fail_indices or set()
candidates: list[VprCandidate] = []
for i in range(k):
tile_id = (18, 49.0 + i * 0.001, 36.0 + i * 0.001)
tile_store.install(tile_id, fail=i in fail_indices, jpeg_seed=i)
candidates.append(_make_vpr_candidate(tile_id=tile_id, distance=distances[i]))
return candidates
# ----------------------------------------------------------------------
# Calibration fixture (the strategy ignores it for now — Protocol shape only).
@pytest.fixture
def calibration():
from gps_denied_onboard._types.calibration import CameraCalibration
return CameraCalibration(
camera_id="cam0",
intrinsics_3x3=np.eye(3, dtype=np.float32),
distortion=np.zeros((5,), dtype=np.float32),
body_to_camera_se3=np.eye(4, dtype=np.float32),
acquisition_method="synthetic",
)
# ----------------------------------------------------------------------
# AC-1 — Protocol conformance.
def test_ac1_isinstance_rerank_strategy(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=_ProgrammableLightGlue(),
)
# Assert
assert isinstance(reranker, ReRankStrategy)
assert hasattr(reranker, "rerank")
# ----------------------------------------------------------------------
# AC-2 — top-N ordering with mixed inlier counts + ties + zeros.
def test_ac2_top_n_ordering_and_tie_break(calibration) -> None:
# Arrange
inlier_counts = [412, 198, 287, 153, 287, 0, 65, 412, 89, 234]
descriptor_distances = [0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
tile_store = _FakeTileStore()
candidates = _install_k_candidates(
tile_store, k=10, distances=descriptor_distances
)
lightglue = _ProgrammableLightGlue()
for count in inlier_counts:
lightglue.queue_inliers(count)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
top_n=3,
)
vpr_result = _make_vpr_result(frame_id=12, candidates=candidates)
# Act
result = reranker.rerank(_make_frame(12), vpr_result, n=3, calibration=calibration)
# Assert
assert len(result.candidates) == 3
assert result.candidates[0].inlier_count == 412
assert result.candidates[0].descriptor_distance == pytest.approx(0.1)
assert result.candidates[1].inlier_count == 412
assert result.candidates[1].descriptor_distance == pytest.approx(0.8)
assert result.candidates[2].inlier_count == 287
assert result.candidates[2].descriptor_distance == pytest.approx(0.2)
# Zero-inlier candidate is dropped; candidates_dropped accounts for it.
assert result.candidates_dropped >= 1
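The AC-2 assertions imply an ordering rule: higher inlier count first, ties broken by smaller descriptor distance, zero-inlier candidates dropped. A sketch of that rule follows, assuming this is how the reranker ranks (the implementation is not shown in this diff); `top_n` is a hypothetical helper operating on plain tuples.

```python
def top_n(scored, n):
    """Rank (inlier_count, descriptor_distance) pairs and keep the best n."""
    survivors = [(count, dist) for count, dist in scored if count > 0]
    # Descending inlier count, then ascending descriptor distance for ties.
    survivors.sort(key=lambda pair: (-pair[0], pair[1]))
    return survivors[:n]


inlier_counts = [412, 198, 287, 153, 287, 0, 65, 412, 89, 234]
descriptor_distances = [0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
best = top_n(list(zip(inlier_counts, descriptor_distances)), 3)
```

On the test's own data this yields the two 412-inlier candidates (distance 0.1 before 0.8) followed by the 287-inlier candidate with distance 0.2, matching the three assertions above.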
# ----------------------------------------------------------------------
# AC-3 — drop-and-continue on LightGlue failure.
def test_ac3_drop_and_continue_on_backbone_error(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
for i in range(10):
if i == 3:
lightglue.queue_error(LightGlueRuntimeError("boom"))
else:
lightglue.queue_inliers(100 + i)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=21, candidates=candidates)
# Act
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(21), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
assert result.candidates_dropped >= 1
backbone_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
]
assert len(backbone_errors) == 1
assert getattr(backbone_errors[0], "kv", {}).get("reason") == "lightglue_forward_failed"
backbone_fdr = [r for r in fdr.records if r.kind == "rerank.backbone_error"]
assert len(backbone_fdr) == 1
# ----------------------------------------------------------------------
# AC-4 — drop-and-continue on TileStore failure.
def test_ac4_drop_and_continue_on_tile_fetch_error(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10, fail_indices={6})
lightglue = _ProgrammableLightGlue()
for _ in range(9):
lightglue.queue_inliers(200)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=42, candidates=candidates)
# Act
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(42), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
assert result.candidates_dropped >= 1
tile_fetch_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.tile_fetch_error"
]
assert len(tile_fetch_errors) == 1
tile_fetch_fdr = [r for r in fdr.records if r.kind == "rerank.tile_fetch_error"]
assert len(tile_fetch_fdr) == 1
# ----------------------------------------------------------------------
# AC-5 — zero survivors raises RerankAllCandidatesFailedError.
def test_ac5_zero_survivors_raises(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
for _ in range(10):
lightglue.queue_error(LightGlueRuntimeError("everything-fails"))
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=99, candidates=candidates)
# Act / Assert
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
with pytest.raises(RerankAllCandidatesFailedError):
reranker.rerank(
_make_frame(99), vpr_result, n=3, calibration=calibration
)
backbone_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
]
assert len(backbone_errors) == 10
all_failed = [
r for r in caplog.records if r.message == "c2_5.rerank.all_failed"
]
assert len(all_failed) == 1
all_failed_fdr = [r for r in fdr.records if r.kind == "rerank.all_failed"]
assert len(all_failed_fdr) == 1
payload = all_failed_fdr[0].payload
assert payload["candidates_input"] == 10
assert payload["candidates_dropped"] == 10
# ----------------------------------------------------------------------
# AC-6 — fewer than N survivors → WARN log + partial result.
def test_ac6_fewer_than_n_survivors_warn(calibration, caplog) -> None:
# Arrange: 2 of 10 candidates survive; the other 8 are dropped.
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
# Two succeed, six fail with LightGlueRuntimeError, two return zero inliers.
success_indices = {0, 5}
zero_indices = {2, 8}
for i in range(10):
if i in success_indices:
lightglue.queue_inliers(300 + i)
elif i in zero_indices:
lightglue.queue_inliers(0)
else:
lightglue.queue_error(LightGlueRuntimeError("bad"))
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=55, candidates=candidates)
# Act
with caplog.at_level(logging.WARNING, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(55), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 2
assert result.candidates_dropped == 8
warn_records = [
r for r in caplog.records if r.message == "c2_5.rerank.fewer_than_n_survivors"
]
assert len(warn_records) == 1
assert getattr(warn_records[0], "kv", {}).get("requested") == 3
assert getattr(warn_records[0], "kv", {}).get("returned") == 2
assert getattr(warn_records[0], "kv", {}).get("dropped") == 8
# ----------------------------------------------------------------------
# AC-7 — tile_pixels_handle is a reference, not a copy.
def test_ac7_tile_pixels_handle_is_reference(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=3)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(500)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
top_n=3,
)
vpr_result = _make_vpr_result(frame_id=1, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(1), vpr_result, n=3, calibration=calibration
)
# Assert — identity preservation against the TileStore-returned handle.
for survivor in result.candidates:
original = tile_store.handle(survivor.tile_id)
assert survivor.tile_pixels_handle is original
# ----------------------------------------------------------------------
# AC-8 — descriptor_distance carried forward unchanged.
def test_ac8_descriptor_distance_preserved(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
distance = 0.123456789
candidates = _install_k_candidates(
tile_store, k=3, distances=[distance, 0.2, 0.3]
)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(700)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=2, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(2), vpr_result, n=3, calibration=calibration
)
# Assert
top_tile = candidates[0].tile_id
matching = [c for c in result.candidates if c.tile_id == top_tile]
assert matching
assert matching[0].descriptor_distance == distance
# ----------------------------------------------------------------------
# AC-9 — deterministic same-inputs → bit-identical RerankResult.candidates.
def test_ac9_deterministic_candidates(calibration) -> None:
# Arrange — single reranker instance called three times so the
# injected clock advances between calls (AC-9: reranked_at MUST
# differ across calls but candidates MUST NOT).
counts = [40, 90, 70, 10, 60, 30, 80, 20, 50, 100]
distances = [0.1 * i for i in range(10)]
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10, distances=distances)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
for c in counts:
lightglue.queue_inliers(c)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=314, candidates=candidates)
# Act
runs: list[RerankResult] = [
reranker.rerank(_make_frame(314), vpr_result, n=3, calibration=calibration)
for _ in range(3)
]
# Assert
triples = [
tuple((c.tile_id, c.inlier_count, c.descriptor_distance) for c in r.candidates)
for r in runs
]
assert triples[0] == triples[1] == triples[2]
# reranked_at differs across calls because Clock.monotonic_ns advances.
assert runs[0].reranked_at != runs[1].reranked_at
assert runs[1].reranked_at != runs[2].reranked_at
# ----------------------------------------------------------------------
# AC-10 — composition-root wiring via the AZ-342 factory.
def test_ac10_composition_root_wiring(monkeypatch, caplog) -> None:
    # Arrange — reuse the module already imported at file top so the
    # class identity matches; the factory's lazy import picks it up
    # from sys.modules unchanged.
    monkeypatch.setenv("BUILD_RERANK_INLIER_COUNT", "ON")
    from gps_denied_onboard.runtime_root.rerank_factory import build_rerank_strategy
    config = Config.with_blocks(
        c2_5_rerank=C2_5RerankConfig(strategy="inlier_count", top_n=3)
    )
    tile_store = _FakeTileStore()
    extractor = _FakeFeatureExtractor()
    lightglue = _ProgrammableLightGlue()
    clock = _FakeClock()
    # Act
    with caplog.at_level(logging.INFO, logger="gps_denied_onboard.c2_5_rerank"):
        instance = build_rerank_strategy(
            config,
            tile_store=tile_store,
            lightglue_runtime=lightglue,
            feature_extractor=extractor,
            clock=clock,
        )
    # Assert
    assert isinstance(instance, InlierCountReRanker)
    assert isinstance(instance, ReRankStrategy)
    assert instance._lightglue_runtime is lightglue
    ready_logs = [r for r in caplog.records if r.message == "c2_5.rerank.ready"]
    assert len(ready_logs) == 1
    kv = getattr(ready_logs[0], "kv", {})
    assert kv.get("strategy") == "inlier_count"
    assert kv.get("N") == 3
    assert kv.get("K") == 10
# ----------------------------------------------------------------------
# AC-11 — FDR rerank.frame_done emission per frame.
def test_ac11_frame_done_fdr_emission(calibration) -> None:
    # Arrange
    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=10)
    lightglue = _ProgrammableLightGlue()
    successes = [412, 287, 198] + [10] * 7  # top three survive ranking.
    for c in successes:
        lightglue.queue_inliers(c)
    fdr = _CapturingFdrClient()
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
        fdr_client=fdr,
    )
    vpr_result = _make_vpr_result(frame_id=77, candidates=candidates)
    # Act
    result = reranker.rerank(
        _make_frame(77), vpr_result, n=3, calibration=calibration
    )
    # Assert
    frame_done = [r for r in fdr.records if r.kind == "rerank.frame_done"]
    assert len(frame_done) == 1
    payload = frame_done[0].payload
    assert payload["frame_id"] == 77
    assert payload["candidates_input"] == 10
    assert payload["top_inlier_count"] == result.candidates[0].inlier_count
# ----------------------------------------------------------------------
# AC-12 — single-pair LightGlue invocation count.
def test_ac12_single_pair_lightglue_called_exactly_k_times(calibration) -> None:
    # Arrange
    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=10)
    lightglue = _ProgrammableLightGlue()
    for i in range(10):
        lightglue.queue_inliers(10 + i)
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
    )
    vpr_result = _make_vpr_result(frame_id=88, candidates=candidates)
    # Act
    reranker.rerank(_make_frame(88), vpr_result, n=3, calibration=calibration)
    # Assert
    assert len(lightglue.calls) == 10
    first_query = lightglue.calls[0][0]
    for query, _ in lightglue.calls[1:]:
        assert query is first_query
# ----------------------------------------------------------------------
# Mixed drop-and-continue smoke (Risk-1 / Risk-2 coverage).
def test_drop_and_continue_mixed_failures(calibration, caplog) -> None:
    # Arrange — 1 TileFetch failure, 1 LightGlue failure, 2 zero-inliers, 6 succeed.
    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=10, fail_indices={2})
    lightglue = _ProgrammableLightGlue()
    # Index 2 is dropped at tile fetch; the remaining 9 indices feed LightGlue.
    counts_for_remaining = [50, 75, 25, 100, 0, 80, 90, 0]  # 8 entries for indices 0,1,3,4,5,6,7,8
    # Index 9 hits a LightGlue error.
    plan: list[object] = []
    rem_iter = iter(counts_for_remaining)
    for i in range(10):
        if i == 2:
            continue
        if i == 9:
            plan.append(LightGlueRuntimeError("backbone-died"))
        else:
            plan.append(next(rem_iter))
    for item in plan:
        if isinstance(item, Exception):
            lightglue.queue_error(item)
        else:
            lightglue.queue_inliers(item)
    fdr = _CapturingFdrClient()
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
        fdr_client=fdr,
    )
    vpr_result = _make_vpr_result(frame_id=66, candidates=candidates)
    # Act
    with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
        result = reranker.rerank(
            _make_frame(66), vpr_result, n=3, calibration=calibration
        )
    # Assert
    assert len(result.candidates) == 3
    # 1 tile-fetch + 1 backbone + 2 zero-inliers = 4 drops.
    assert result.candidates_dropped == 4
    backbone_errors = [
        r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
    ]
    assert len(backbone_errors) == 1
    tile_fetch_errors = [
        r for r in caplog.records if r.message == "c2_5.rerank.tile_fetch_error"
    ]
    assert len(tile_fetch_errors) == 1
    assert any(r.kind == "rerank.backbone_error" for r in fdr.records)
    assert any(r.kind == "rerank.tile_fetch_error" for r in fdr.records)
    assert any(r.kind == "rerank.frame_done" for r in fdr.records)
# ----------------------------------------------------------------------
# Public API — ``InlierCountReRanker`` stays out of c2_5_rerank.__all__ (AC-8).
def test_inlier_count_reranker_not_publicly_re_exported() -> None:
    # Arrange / Act
    from gps_denied_onboard.components import c2_5_rerank
    # Assert
    assert "InlierCountReRanker" not in c2_5_rerank.__all__
# ----------------------------------------------------------------------
# Module-level create() is the factory entry-point (Outcome step 5).
def test_create_returns_inlier_count_reranker() -> None:
    # Arrange
    config = Config.with_blocks(
        c2_5_rerank=C2_5RerankConfig(strategy="inlier_count", top_n=3)
    )
    # Act
    instance = create(
        config,
        tile_store=_FakeTileStore(),
        lightglue_runtime=_ProgrammableLightGlue(),
        feature_extractor=_FakeFeatureExtractor(),
        clock=_FakeClock(),
    )
    # Assert
    assert isinstance(instance, InlierCountReRanker)
    assert isinstance(instance, ReRankStrategy)
# ----------------------------------------------------------------------
# Health: no_input_candidates short-circuit also raises.
def test_zero_input_candidates_short_circuits(calibration) -> None:
    # Arrange
    reranker = _build_reranker(
        tile_store=_FakeTileStore(),
        extractor=_FakeFeatureExtractor(),
        lightglue=_ProgrammableLightGlue(),
    )
    vpr_result = _make_vpr_result(frame_id=5, candidates=[])
    # Act / Assert
    with pytest.raises(RerankAllCandidatesFailedError):
        reranker.rerank(
            _make_frame(5), vpr_result, n=3, calibration=calibration
        )
# ----------------------------------------------------------------------
# DEBUG gating — per-frame frame_done DEBUG only fires when configured on.
def test_debug_per_frame_log_gated_off_by_default(calibration, caplog) -> None:
    # Arrange
    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=3)
    lightglue = _ProgrammableLightGlue()
    for _ in range(3):
        lightglue.queue_inliers(100)
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
        debug_per_frame_log=False,
    )
    vpr_result = _make_vpr_result(frame_id=33, candidates=candidates)
    # Act
    with caplog.at_level(logging.DEBUG, logger="gps_denied_onboard.c2_5_rerank"):
        reranker.rerank(_make_frame(33), vpr_result, n=3, calibration=calibration)
    # Assert
    debug_records = [
        r for r in caplog.records if r.message == "c2_5.rerank.frame_done"
    ]
    assert debug_records == []


def test_debug_per_frame_log_emits_when_enabled(calibration, caplog) -> None:
    # Arrange
    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=3)
    lightglue = _ProgrammableLightGlue()
    for _ in range(3):
        lightglue.queue_inliers(100)
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
        debug_per_frame_log=True,
    )
    vpr_result = _make_vpr_result(frame_id=34, candidates=candidates)
    # Act
    with caplog.at_level(logging.DEBUG, logger="gps_denied_onboard.c2_5_rerank"):
        reranker.rerank(_make_frame(34), vpr_result, n=3, calibration=calibration)
    # Assert
    debug_records = [
        r for r in caplog.records if r.message == "c2_5.rerank.frame_done"
    ]
    assert len(debug_records) == 1
# ----------------------------------------------------------------------
# FDR enqueue failures must NEVER promote to drop events (observability-only).
def test_fdr_enqueue_failure_is_swallowed(calibration) -> None:
    # Arrange
    class _BrokenFdr:
        def enqueue(self, record):
            raise RuntimeError("queue broken")

    tile_store = _FakeTileStore()
    candidates = _install_k_candidates(tile_store, k=3)
    lightglue = _ProgrammableLightGlue()
    for _ in range(3):
        lightglue.queue_inliers(100)
    reranker = _build_reranker(
        tile_store=tile_store,
        extractor=_FakeFeatureExtractor(),
        lightglue=lightglue,
        fdr_client=_BrokenFdr(),
    )
    vpr_result = _make_vpr_result(frame_id=99, candidates=candidates)
    # Act
    result = reranker.rerank(
        _make_frame(99), vpr_result, n=3, calibration=calibration
    )
    # Assert
    assert len(result.candidates) == 3
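
The drop-and-continue behaviour that `test_drop_and_continue_mixed_failures` and `test_zero_input_candidates_short_circuits` exercise can be illustrated with a minimal, self-contained sketch. It is not the `InlierCountReRanker` implementation: `rerank_drop_and_continue`, `Ranked`, `AllCandidatesFailed`, `fetch` and `match` are hypothetical names, and the tie-breaking rule (input order) is an assumption rather than something the tests above confirm.

```python
from dataclasses import dataclass


class AllCandidatesFailed(Exception):
    """Hypothetical stand-in for RerankAllCandidatesFailedError."""


@dataclass(frozen=True)
class Ranked:
    tile_id: int
    inlier_count: int


def rerank_drop_and_continue(tile_ids, fetch, match, n):
    """Return (top-n survivors by inlier count, number of dropped candidates)."""
    survivors = []
    dropped = 0
    for tile_id in tile_ids:
        try:
            pixels = fetch(tile_id)   # may raise: tile-fetch failure
            inliers = match(pixels)   # may raise: matcher failure
        except Exception:
            dropped += 1              # drop this candidate, keep going
            continue
        if inliers == 0:              # zero inliers also counts as a drop
            dropped += 1
            continue
        survivors.append(Ranked(tile_id, inliers))
    if not survivors:
        raise AllCandidatesFailed("no candidate survived reranking")
    # list.sort is stable, so ties keep their input order (assumed rule).
    survivors.sort(key=lambda r: r.inlier_count, reverse=True)
    return survivors[:n], dropped


# Usage: the same failure mix as the test above (index 2 fails at fetch,
# index 9 fails in the matcher, indices 5 and 8 report zero inliers).
counts = {0: 50, 1: 75, 3: 25, 4: 100, 5: 0, 6: 80, 7: 90, 8: 0}


def fetch(tile_id):
    if tile_id == 2:
        raise OSError("tile missing")
    return tile_id


def match(pixels):
    if pixels == 9:
        raise RuntimeError("backbone-died")
    return counts[pixels]


top, dropped = rerank_drop_and_continue(range(10), fetch, match, 3)
```

With this input the sketch keeps three candidates and reports four drops, which matches the counts the test asserts; an empty candidate list raises, as in the zero-input test.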
@@ -0,0 +1,454 @@
"""AZ-342 — C2.5 ReRankStrategy Protocol + DTO + error + factory conformance.
Covers AC-1..AC-8 + AC-11 + NFRs. AC-9 (single-thread binding) and
AC-10 (LightGlueRuntime identity-share between C2.5 and C3) are
deferred per the task spec's Risk-4 escape clause — the generic
compose_root thread-binding registry and the cross-factory helper
identity assertion live with AZ-270 and the future C3 protocol task
(AZ-344). Each factory owns its own thread binding today.
"""
from __future__ import annotations

import dataclasses
import logging
import sys
import time
import types

import pytest

from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.components.c2_5_rerank import (
    C2_5RerankConfig,
    ReRankStrategy,
    RerankAllCandidatesFailedError,
    RerankBackboneError,
    RerankError,
)
from gps_denied_onboard.components.c2_5_rerank.config import KNOWN_STRATEGIES
from gps_denied_onboard.config.schema import Config, ConfigError
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
from gps_denied_onboard.runtime_root.rerank_factory import build_rerank_strategy

_STRATEGY_MODULES: dict[str, tuple[str, str, str]] = {
    "inlier_count": (
        "gps_denied_onboard.components.c2_5_rerank.inlier_based_reranker",
        "InlierCountReRanker",
        "BUILD_RERANK_INLIER_COUNT",
    ),
}
# ----------------------------------------------------------------------
# Fakes that structurally satisfy the ReRankStrategy Protocol.
class _FakeTileStore:
    def read_tile_pixels(self, tile_id):
        raise NotImplementedError

    def write_tile(self, tile_blob, metadata):
        raise NotImplementedError

    def tile_exists(self, tile_id):
        return False

    def delete_tile(self, tile_id):
        return False


class _FakeLightGlueRuntime:
    def descriptor_dim(self):
        return 256

    def match(self, features_a, features_b):
        raise NotImplementedError

    def match_batch(self, features_a_list, features_b_list):
        raise NotImplementedError


class _FakeFeatureExtractor:
    def descriptor_dim(self):
        return 256

    def extract(self, image_bgr):
        raise NotImplementedError


class _FakeClock:
    def __init__(self) -> None:
        self._t = 1_000_000_000

    def monotonic_ns(self):
        self._t += 1
        return self._t

    def time_ns(self):
        return self._t

    def sleep_until_ns(self, target_ns):
        return None


class _FullReRankStrategy:
    def __init__(
        self,
        config,
        *,
        tile_store,
        lightglue_runtime,
        feature_extractor=None,
        clock=None,
        fdr_client=None,
    ) -> None:
        self._config = config
        self._tile_store = tile_store
        self._lightglue_runtime = lightglue_runtime
        self._feature_extractor = feature_extractor
        self._clock = clock
        self._fdr_client = fdr_client
        self._label = config.components["c2_5_rerank"].strategy

    def rerank(self, frame, vpr_result, n, calibration):
        return RerankResult(
            frame_id=getattr(frame, "frame_id", 0),
            candidates=tuple(),
            reranked_at=1_000_000_000,
            rerank_label=self._label,
            candidates_input=0,
            candidates_dropped=0,
        )


class _PartialReRankStrategy:
    pass


def _config_with_strategy(strategy: str = "inlier_count") -> Config:
    return Config.with_blocks(c2_5_rerank=C2_5RerankConfig(strategy=strategy))


def _install_fake_strategy(strategy_label: str) -> type:
    module_name, class_name, _flag = _STRATEGY_MODULES[strategy_label]

    class _FakeStrategy(_FullReRankStrategy):
        pass

    _FakeStrategy.__name__ = class_name
    module = types.ModuleType(module_name)
    setattr(module, class_name, _FakeStrategy)
    sys.modules[module_name] = module
    return _FakeStrategy


@pytest.fixture
def strategy_module_cleanup():
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
    yield
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
# ----------------------------------------------------------------------
# AC-1: Protocol conformance.
def test_ac1_rerank_strategy_conformance_full() -> None:
    instance = _FullReRankStrategy(
        _config_with_strategy(),
        tile_store=_FakeTileStore(),
        lightglue_runtime=_FakeLightGlueRuntime(),
        clock=_FakeClock(),
    )
    assert isinstance(instance, ReRankStrategy)


def test_ac1_rerank_strategy_conformance_partial_missing_methods() -> None:
    assert not isinstance(_PartialReRankStrategy(), ReRankStrategy)
# ----------------------------------------------------------------------
# AC-2: frozen+slotted DTOs.
def _make_candidate() -> RerankCandidate:
    return RerankCandidate(
        tile_id=(18, 49.9, 36.3),
        inlier_count=42,
        descriptor_distance=0.123,
        descriptor_dim=512,
        tile_pixels_handle=object(),
    )


def _make_result(frame_id: int = 7) -> RerankResult:
    return RerankResult(
        frame_id=frame_id,
        candidates=(_make_candidate(),),
        reranked_at=1_000_000_000,
        rerank_label="inlier_count",
        candidates_input=10,
        candidates_dropped=9,
    )


@pytest.mark.parametrize(
    "dto, field_name, new_value",
    [
        (_make_candidate(), "inlier_count", 99),
        (_make_result(), "rerank_label", "learned_reranker"),
    ],
)
def test_ac2_frozen_dtos_reject_mutation(dto, field_name: str, new_value) -> None:
    original_value = getattr(dto, field_name)
    with pytest.raises(dataclasses.FrozenInstanceError):
        setattr(dto, field_name, new_value)
    assert getattr(dto, field_name) == original_value


@pytest.mark.parametrize("cls", [RerankCandidate, RerankResult])
def test_ac2_dtos_have_slots(cls) -> None:
    assert hasattr(cls, "__slots__")
    assert cls.__slots__
    instance = _make_candidate() if cls is RerankCandidate else _make_result()
    assert not hasattr(instance, "__dict__"), (
        f"{cls.__name__} carries a __dict__ — slots=True is missing"
    )
# ----------------------------------------------------------------------
# AC-3: factory rejects missing build flag.
def test_ac3_factory_rejects_missing_build_flag(
    monkeypatch, strategy_module_cleanup, caplog
) -> None:
    strategy = "inlier_count"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.delenv(flag, raising=False)
    config = _config_with_strategy(strategy)
    with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
        with pytest.raises(StrategyNotAvailableError) as exc_info:
            build_rerank_strategy(
                config,
                tile_store=_FakeTileStore(),
                lightglue_runtime=_FakeLightGlueRuntime(),
                feature_extractor=_FakeFeatureExtractor(),
                clock=_FakeClock(),
            )
    assert "BUILD_RERANK_INLIER_COUNT is OFF" in str(exc_info.value)
    assert any(
        r.message == "c2_5.rerank.build_flag_off" for r in caplog.records
    )


def test_ac3_factory_does_not_load_module_when_flag_off(
    monkeypatch, strategy_module_cleanup
) -> None:
    module_name, _, flag = _STRATEGY_MODULES["inlier_count"]
    monkeypatch.delenv(flag, raising=False)
    config = _config_with_strategy("inlier_count")
    with pytest.raises(StrategyNotAvailableError):
        build_rerank_strategy(
            config,
            tile_store=_FakeTileStore(),
            lightglue_runtime=_FakeLightGlueRuntime(),
            feature_extractor=_FakeFeatureExtractor(),
            clock=_FakeClock(),
        )
    assert module_name not in sys.modules
# ----------------------------------------------------------------------
# AC-4: unknown strategy rejected at config-load time.
@pytest.mark.parametrize(
    "bad_label",
    ["INLIER_COUNT", "garbage", "", "learned_reranker"],
)
def test_ac4_unknown_strategy_rejected_at_config_load(bad_label: str) -> None:
    with pytest.raises(ConfigError) as exc_info:
        C2_5RerankConfig(strategy=bad_label)
    msg = str(exc_info.value)
    for valid in KNOWN_STRATEGIES:
        assert valid in msg
# ----------------------------------------------------------------------
# AC-5: factory emits INFO log on success.
def test_ac5_factory_emits_info_log_on_success(
    monkeypatch, strategy_module_cleanup, caplog
) -> None:
    strategy = "inlier_count"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    _install_fake_strategy(strategy)
    config = _config_with_strategy(strategy)
    with caplog.at_level(logging.INFO, logger="gps_denied_onboard.c2_5_rerank"):
        instance = build_rerank_strategy(
            config,
            tile_store=_FakeTileStore(),
            lightglue_runtime=_FakeLightGlueRuntime(),
            feature_extractor=_FakeFeatureExtractor(),
            clock=_FakeClock(),
        )
    assert isinstance(instance, ReRankStrategy)
    records = [
        r for r in caplog.records if r.message == "c2_5.rerank.strategy_loaded"
    ]
    assert len(records) == 1
    record = records[0]
    assert getattr(record, "strategy", None) == "inlier_count"
    assert getattr(record, "top_n", None) == 3
# ----------------------------------------------------------------------
# AC-6: strategy resolution table.
def test_ac6_strategy_resolution(monkeypatch, strategy_module_cleanup) -> None:
    strategy = "inlier_count"
    module_name, class_name, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    fake_cls = _install_fake_strategy(strategy)
    config = _config_with_strategy(strategy)
    instance = build_rerank_strategy(
        config,
        tile_store=_FakeTileStore(),
        lightglue_runtime=_FakeLightGlueRuntime(),
        feature_extractor=_FakeFeatureExtractor(),
        clock=_FakeClock(),
    )
    assert isinstance(instance, fake_cls)
    assert isinstance(instance, ReRankStrategy)
    assert sys.modules[module_name] is not None
# ----------------------------------------------------------------------
# AC-7: error hierarchy.
@pytest.mark.parametrize(
    "exc_factory",
    [RerankBackboneError, RerankAllCandidatesFailedError],
)
def test_ac7_all_rerank_errors_caught_as_family(exc_factory) -> None:
    with pytest.raises(RerankError):
        raise exc_factory("boom")


def test_ac7_strategy_not_available_outside_family() -> None:
    with pytest.raises(StrategyNotAvailableError):
        try:
            raise StrategyNotAvailableError("composition-time")
        except RerankError:
            pytest.fail(
                "StrategyNotAvailableError is a composition-root error "
                "and MUST NOT be in the c2.5 RerankError family"
            )
# ----------------------------------------------------------------------
# AC-8: Public API re-exports.
def test_ac8_public_api_re_exports() -> None:
    from gps_denied_onboard.components import c2_5_rerank
    assert "ReRankStrategy" in c2_5_rerank.__all__
    assert "RerankResult" in c2_5_rerank.__all__
    assert "RerankCandidate" in c2_5_rerank.__all__


def test_ac8_internals_not_in_public_api() -> None:
    from gps_denied_onboard.components import c2_5_rerank
    # Concrete strategy must not leak into the package re-exports;
    # consumers see only the Protocol.
    assert "InlierCountReRanker" not in c2_5_rerank.__all__
# ----------------------------------------------------------------------
# AC-11: tile_pixels_handle opaqueness.
def test_ac11_tile_pixels_handle_opaque() -> None:
    handle = object()
    candidate = RerankCandidate(
        tile_id=(18, 49.9, 36.3),
        inlier_count=10,
        descriptor_distance=0.5,
        descriptor_dim=256,
        tile_pixels_handle=handle,
    )
    assert candidate.tile_pixels_handle is handle
# ----------------------------------------------------------------------
# NFRs.
@pytest.mark.parametrize(
    "exc_type",
    [RerankBackboneError, RerankAllCandidatesFailedError],
)
def test_nfr_reliability_all_rerank_errors_subclass_family(exc_type) -> None:
    assert issubclass(exc_type, RerankError)


def test_nfr_reliability_strategy_not_available_not_in_family() -> None:
    assert not issubclass(StrategyNotAvailableError, RerankError)


def test_nfr_perf_factory_under_50ms_p99(
    monkeypatch, strategy_module_cleanup
) -> None:
    strategy = "inlier_count"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    _install_fake_strategy(strategy)
    config = _config_with_strategy(strategy)
    tile_store = _FakeTileStore()
    lightglue_runtime = _FakeLightGlueRuntime()
    feature_extractor = _FakeFeatureExtractor()
    clock = _FakeClock()
    durations_ms: list[float] = []
    for _ in range(100):
        t0 = time.perf_counter()
        build_rerank_strategy(
            config,
            tile_store=tile_store,
            lightglue_runtime=lightglue_runtime,
            feature_extractor=feature_extractor,
            clock=clock,
        )
        durations_ms.append((time.perf_counter() - t0) * 1000.0)
    durations_ms.sort()
    p99 = durations_ms[int(0.99 * len(durations_ms))]
    assert p99 <= 50.0
# ----------------------------------------------------------------------
# Surface coverage — config defaults.
def test_c2_5_config_defaults() -> None:
    cfg = C2_5RerankConfig()
    assert cfg.strategy == "inlier_count"
    assert cfg.top_n == 3


def test_c2_5_config_top_n_validation() -> None:
    with pytest.raises(ConfigError):
        C2_5RerankConfig(top_n=0)
    with pytest.raises(ConfigError):
        C2_5RerankConfig(top_n=-3)
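
The factory contract checked by AC-3, AC-5 and AC-6 (reject when the build flag is off, never import the strategy module in that case, otherwise resolve the class through a lookup table) can be sketched as follows. This is an illustrative outline only, not the code in `runtime_root.rerank_factory`: `build_strategy`, `_TABLE`, `StrategyNotAvailable`, the `demo_pkg` module path and the `BUILD_DEMO_RERANK` flag are all hypothetical, and reading the flag from an environment variable is inferred from the tests' use of `monkeypatch.setenv`.

```python
import importlib
import os
import sys
import types


class StrategyNotAvailable(Exception):
    """Hypothetical stand-in for StrategyNotAvailableError."""


# label -> (module path, class name, build flag); all values are illustrative.
_TABLE = {
    "inlier_count": ("demo_pkg.inlier_reranker", "DemoReRanker", "BUILD_DEMO_RERANK"),
}


def build_strategy(label, **deps):
    module_name, class_name, flag = _TABLE[label]
    # Check the flag BEFORE importing so a disabled strategy is never loaded.
    if os.environ.get(flag) != "ON":
        raise StrategyNotAvailable(f"{flag} is OFF")
    # import_module returns the entry already in sys.modules when one exists,
    # which is what lets a test substitute a fake module.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)(**deps)


# Usage: register a fake module, as the tests do, then build with the flag ON.
class DemoReRanker:
    def __init__(self, **deps):
        self.deps = deps


fake_module = types.ModuleType("demo_pkg.inlier_reranker")
fake_module.DemoReRanker = DemoReRanker
sys.modules["demo_pkg.inlier_reranker"] = fake_module
os.environ["BUILD_DEMO_RERANK"] = "ON"
instance = build_strategy("inlier_count", clock=None)
```

Checking the flag ahead of the import is what makes the "module not in `sys.modules`" assertion meaningful: with the opposite order, a heavy native dependency could be loaded even for a strategy that is compiled out.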
@@ -1,9 +0,0 @@
"""C2.5 Rerank smoke test — AC-9."""
def test_interface_importable() -> None:
    # Assert
    from gps_denied_onboard.components.c2_5_rerank import RerankResult, RerankStrategy
    assert RerankStrategy is not None
    assert RerankResult is not None
@@ -0,0 +1,528 @@
"""AZ-336 — C2 VprStrategy Protocol + DTO + error + factory conformance.
Covers AC-1..AC-8 of AZ-336 + the NFRs (AC-9 is deferred per the task
spec's Risk 4). The factory ACs (AC-3..AC-6)
substitute fake strategy modules at the ``sys.modules`` boundary so
the test never touches UltraVPR / NetVLAD / FAISS / TensorRT native
libraries.
"""
from __future__ import annotations

import dataclasses
import logging
import sys
import time
import types
from pathlib import Path

import numpy as np
import pytest

from gps_denied_onboard._types.vpr import VprCandidate, VprQuery, VprResult
from gps_denied_onboard.components.c2_vpr import (
    C2VprConfig,
    IndexUnavailableError,
    VprBackboneError,
    VprError,
    VprPreprocessError,
    VprStrategy,
)
from gps_denied_onboard.components.c2_vpr._preprocessor import BackbonePreprocessor
from gps_denied_onboard.components.c2_vpr.config import KNOWN_STRATEGIES
from gps_denied_onboard.config.schema import Config, ConfigError
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
from gps_denied_onboard.runtime_root.vpr_factory import build_vpr_strategy

_STRATEGY_MODULES: dict[str, tuple[str, str, str]] = {
    "ultra_vpr": (
        "gps_denied_onboard.components.c2_vpr.ultra_vpr",
        "UltraVprStrategy",
        "BUILD_VPR_ULTRA_VPR",
    ),
    "net_vlad": (
        "gps_denied_onboard.components.c2_vpr.net_vlad",
        "NetVladStrategy",
        "BUILD_VPR_NETVLAD",
    ),
    "mega_loc": (
        "gps_denied_onboard.components.c2_vpr.mega_loc",
        "MegaLocStrategy",
        "BUILD_VPR_MEGALOC",
    ),
    "mix_vpr": (
        "gps_denied_onboard.components.c2_vpr.mix_vpr",
        "MixVprStrategy",
        "BUILD_VPR_MIXVPR",
    ),
    "sela_vpr": (
        "gps_denied_onboard.components.c2_vpr.sela_vpr",
        "SelaVprStrategy",
        "BUILD_VPR_SELAVPR",
    ),
    "eigen_places": (
        "gps_denied_onboard.components.c2_vpr.eigen_places",
        "EigenPlacesStrategy",
        "BUILD_VPR_EIGENPLACES",
    ),
    "salad": (
        "gps_denied_onboard.components.c2_vpr.salad",
        "SaladStrategy",
        "BUILD_VPR_SALAD",
    ),
}
# ----------------------------------------------------------------------
# Fakes that structurally satisfy the VprStrategy + DescriptorIndex
# Protocols. Tests substitute these at the sys.modules boundary so no
# native library is loaded.
class _FakeDescriptorIndex:
    def __init__(self, dim: int = 512) -> None:
        self._dim = dim

    def search_topk(self, query, k):
        return []

    def descriptor_dim(self):
        return self._dim

    def mmap_handle(self):
        return Path("/tmp/fake.faiss")

    def rebuild_from_descriptors(self, descriptors, tile_ids, hnsw_params):
        return None

    def index_metadata(self):
        raise NotImplementedError


class _FakeInferenceRuntime:
    def load_engine(self, *args, **kwargs):
        raise NotImplementedError

    def infer(self, *args, **kwargs):
        raise NotImplementedError

    def warm_up(self):
        return None

    def thermal_state(self):
        raise NotImplementedError


class _FullVprStrategy:
    def __init__(self, config, *, descriptor_index, inference_runtime, dim=512) -> None:
        self._config = config
        self._descriptor_index = descriptor_index
        self._inference_runtime = inference_runtime
        self._dim = dim
        self._label = config.components["c2_vpr"].strategy

    def embed_query(self, frame, calibration):
        return VprQuery(
            frame_id=getattr(frame, "frame_id", 0),
            embedding=np.zeros((self._dim,), dtype=np.float32),
            produced_at=1_000_000_000,
        )

    def retrieve_topk(self, query, k):
        return VprResult(
            frame_id=query.frame_id,
            candidates=tuple(),
            retrieved_at=1_000_000_000,
            backbone_label=self._label,
        )

    def descriptor_dim(self):
        return self._dim


class _PartialVprStrategy:
    def embed_query(self, frame, calibration):
        raise NotImplementedError

    def retrieve_topk(self, query, k):
        raise NotImplementedError


def _config_with_strategy(strategy: str) -> Config:
    return Config.with_blocks(c2_vpr=C2VprConfig(strategy=strategy))


def _install_fake_strategy(strategy_label: str, *, dim: int = 512) -> type:
    module_name, class_name, _flag = _STRATEGY_MODULES[strategy_label]

    class _FakeStrategy(_FullVprStrategy):
        def __init__(self, config, **kwargs) -> None:
            super().__init__(config, dim=dim, **kwargs)

    _FakeStrategy.__name__ = class_name
    module = types.ModuleType(module_name)
    setattr(module, class_name, _FakeStrategy)
    sys.modules[module_name] = module
    return _FakeStrategy


@pytest.fixture
def strategy_module_cleanup():
    """Pop every fake strategy module before/after each factory test."""
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
    yield
    for module_name, _, _ in _STRATEGY_MODULES.values():
        sys.modules.pop(module_name, None)
# ----------------------------------------------------------------------
# AC-1: Protocol conformance — full satisfies, partial does not.
def test_ac1_vpr_strategy_conformance_full() -> None:
    instance = _FullVprStrategy(
        _config_with_strategy("net_vlad"),
        descriptor_index=_FakeDescriptorIndex(),
        inference_runtime=_FakeInferenceRuntime(),
    )
    assert isinstance(instance, VprStrategy)


def test_ac1_vpr_strategy_conformance_partial_missing_methods() -> None:
    assert not isinstance(_PartialVprStrategy(), VprStrategy)
# ----------------------------------------------------------------------
# AC-2: frozen+slotted DTOs reject mutation and forbid __dict__.
def _make_query(frame_id: int = 7, dim: int = 512) -> VprQuery:
    return VprQuery(
        frame_id=frame_id,
        embedding=np.zeros((dim,), dtype=np.float32),
        produced_at=1_000_000_000,
    )


def _make_candidate() -> VprCandidate:
    return VprCandidate(
        tile_id=(18, 49.9, 36.3),
        descriptor_distance=0.123,
        descriptor_dim=512,
    )


def _make_result(frame_id: int = 7) -> VprResult:
    return VprResult(
        frame_id=frame_id,
        candidates=(_make_candidate(),),
        retrieved_at=1_000_000_000,
        backbone_label="net_vlad",
    )


@pytest.mark.parametrize(
    "dto, field_name, new_value",
    [
        (_make_query(), "frame_id", 99),
        (_make_candidate(), "descriptor_distance", 0.9),
        (_make_result(), "backbone_label", "ultra_vpr"),
    ],
)
def test_ac2_frozen_dtos_reject_mutation(dto, field_name: str, new_value) -> None:
    original_value = getattr(dto, field_name)
    with pytest.raises(dataclasses.FrozenInstanceError):
        setattr(dto, field_name, new_value)
    assert getattr(dto, field_name) == original_value


@pytest.mark.parametrize("cls", [VprQuery, VprCandidate, VprResult])
def test_ac2_dtos_have_slots(cls) -> None:
    assert hasattr(cls, "__slots__"), f"{cls.__name__} must use slots=True"
    assert cls.__slots__, f"{cls.__name__}.__slots__ must be non-empty"
    instance = (
        _make_query()
        if cls is VprQuery
        else _make_candidate()
        if cls is VprCandidate
        else _make_result()
    )
    assert not hasattr(instance, "__dict__"), (
        f"{cls.__name__} carries a __dict__ — slots=True is missing"
    )
# ----------------------------------------------------------------------
# AC-3: factory rejects missing build flag.
def test_ac3_factory_rejects_missing_build_flag(
    monkeypatch, strategy_module_cleanup, caplog
) -> None:
    strategy = "ultra_vpr"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.delenv(flag, raising=False)
    config = _config_with_strategy(strategy)
    with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_vpr"):
        with pytest.raises(StrategyNotAvailableError) as exc_info:
            build_vpr_strategy(
                config,
                descriptor_index=_FakeDescriptorIndex(),
                inference_runtime=_FakeInferenceRuntime(),
            )
    assert "BUILD_VPR_ULTRA_VPR is OFF" in str(exc_info.value)
    assert any(
        r.message == "c2.vpr.build_flag_off"
        for r in caplog.records
    ), "ERROR log kind=c2.vpr.build_flag_off must be emitted"


@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac3_factory_does_not_load_module_when_flag_off(
    monkeypatch, strategy_module_cleanup, strategy
) -> None:
    module_name, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.delenv(flag, raising=False)
    config = _config_with_strategy(strategy)
    with pytest.raises(StrategyNotAvailableError):
        build_vpr_strategy(
            config,
            descriptor_index=_FakeDescriptorIndex(),
            inference_runtime=_FakeInferenceRuntime(),
        )
    assert module_name not in sys.modules, (
        f"{module_name} must NOT be in sys.modules when its BUILD flag is OFF"
    )
# ----------------------------------------------------------------------
# AC-4: factory rejects descriptor_dim mismatch.
def test_ac4_factory_rejects_dim_mismatch(
    monkeypatch, strategy_module_cleanup, caplog
) -> None:
    strategy = "ultra_vpr"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    _install_fake_strategy(strategy, dim=512)
    config = _config_with_strategy(strategy)
    with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_vpr"):
        with pytest.raises(ConfigError) as exc_info:
            build_vpr_strategy(
                config,
                descriptor_index=_FakeDescriptorIndex(dim=4096),
                inference_runtime=_FakeInferenceRuntime(),
            )
    assert "descriptor_dim mismatch: strategy=512, corpus=4096" in str(
        exc_info.value
    )
    assert any(
        r.message == "c2.vpr.dim_mismatch"
        for r in caplog.records
    ), "ERROR log kind=c2.vpr.dim_mismatch must be emitted"
# ----------------------------------------------------------------------
# AC-5: successful factory load emits INFO log with structured fields.
def test_ac5_factory_emits_info_log_on_success(
    monkeypatch, strategy_module_cleanup, caplog
) -> None:
    strategy = "ultra_vpr"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    _install_fake_strategy(strategy, dim=512)
    config = _config_with_strategy(strategy)
    with caplog.at_level(logging.INFO, logger="gps_denied_onboard.c2_vpr"):
        instance = build_vpr_strategy(
            config,
            descriptor_index=_FakeDescriptorIndex(dim=512),
            inference_runtime=_FakeInferenceRuntime(),
        )
    assert isinstance(instance, VprStrategy)
    records = [
        r for r in caplog.records if r.message == "c2.vpr.strategy_loaded"
    ]
    assert len(records) == 1, "Exactly one strategy_loaded INFO log expected"
    record = records[0]
    assert getattr(record, "strategy", None) == "ultra_vpr"
    assert getattr(record, "descriptor_dim", None) == 512
# ----------------------------------------------------------------------
# AC-6: every entry in the resolution table resolves to its module path.
@pytest.mark.parametrize("strategy", sorted(_STRATEGY_MODULES))
def test_ac6_strategy_resolution_table(
    monkeypatch, strategy_module_cleanup, strategy
) -> None:
    module_name, class_name, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    fake_cls = _install_fake_strategy(strategy, dim=512)
    config = _config_with_strategy(strategy)
    instance = build_vpr_strategy(
        config,
        descriptor_index=_FakeDescriptorIndex(dim=512),
        inference_runtime=_FakeInferenceRuntime(),
    )
    assert isinstance(instance, fake_cls)
    assert isinstance(instance, VprStrategy)
    assert sys.modules[module_name] is not None
# ----------------------------------------------------------------------
# AC-7: error hierarchy — every concrete error is catchable as VprError.
@pytest.mark.parametrize(
    "exc_factory",
    [VprBackboneError, VprPreprocessError, IndexUnavailableError],
)
def test_ac7_all_vpr_errors_caught_as_family(exc_factory) -> None:
    with pytest.raises(VprError):
        raise exc_factory("boom")


def test_ac7_unrelated_exception_not_caught_as_family() -> None:
    with pytest.raises(ValueError):
        try:
            raise ValueError("not us")
        except VprError:
            pytest.fail("ValueError must not be caught as VprError")


def test_ac7_strategy_not_available_outside_family() -> None:
    with pytest.raises(StrategyNotAvailableError):
        try:
            raise StrategyNotAvailableError("composition-time")
        except VprError:
            pytest.fail(
                "StrategyNotAvailableError is a composition-root error "
                "and MUST NOT be in the c2 VprError family"
            )


# ----------------------------------------------------------------------
# AC-8: Public API surface — re-exports + BackbonePreprocessor exclusion.


def test_ac8_public_api_re_exports() -> None:
    from gps_denied_onboard.components import c2_vpr

    assert "VprStrategy" in c2_vpr.__all__
    assert "VprQuery" in c2_vpr.__all__
    assert "VprCandidate" in c2_vpr.__all__
    assert "VprResult" in c2_vpr.__all__


def test_ac8_backbone_preprocessor_not_in_public_api() -> None:
    from gps_denied_onboard.components import c2_vpr

    assert "BackbonePreprocessor" not in c2_vpr.__all__
    assert not hasattr(c2_vpr, "BackbonePreprocessor"), (
        "BackbonePreprocessor is C2-internal per description.md § 6 and "
        "MUST NOT be re-exported from c2_vpr/__init__.py"
    )


def test_ac8_backbone_preprocessor_protocol_is_runtime_checkable() -> None:
    # BackbonePreprocessor is internal but still a Protocol; tests in
    # AZ-337..AZ-340 will use isinstance against it.
    class _OkPreprocessor:
        def preprocess(self, frame, calibration):
            raise NotImplementedError

        def input_shape(self):
            return (224, 224)

    assert isinstance(_OkPreprocessor(), BackbonePreprocessor)


# ----------------------------------------------------------------------
# Config validation — unknown strategy label is rejected at load.
# (AC-9 single-thread binding is deferred per AZ-336 task spec Risk 4;
# the generic compose_root thread-binding registry referenced by AC-9
# has not materialised — each factory owns its own thread binding
# today, e.g. ``runtime_root.fc_factory.clear_outbound_thread_binding``.
# AC-9 ships with AZ-270's registry or its replacement; this task
# delivers AC-1..AC-8 + NFRs in line with the spec's escape clause.)


@pytest.mark.parametrize(
    "bad_label",
    ["ULTRA_VPR", "ultraVpr", "openVLAD", "", "vins_mono"],
)
def test_unknown_strategy_rejected_at_config_load(bad_label: str) -> None:
    with pytest.raises(ConfigError) as exc_info:
        C2VprConfig(strategy=bad_label)
    msg = str(exc_info.value)
    for valid in KNOWN_STRATEGIES:
        assert valid in msg


# ----------------------------------------------------------------------
# NFRs.


@pytest.mark.parametrize(
    "exc_type",
    [VprBackboneError, VprPreprocessError, IndexUnavailableError],
)
def test_nfr_reliability_all_vpr_errors_subclass_family(exc_type) -> None:
    assert issubclass(exc_type, VprError)


def test_nfr_reliability_strategy_not_available_not_in_family() -> None:
    assert not issubclass(StrategyNotAvailableError, VprError)


def test_nfr_perf_factory_under_50ms_p99(
    monkeypatch, strategy_module_cleanup
) -> None:
    """Factory p99 ≤ 50 ms across 100 calls (NFR-perf-factory)."""
    strategy = "net_vlad"
    _, _, flag = _STRATEGY_MODULES[strategy]
    monkeypatch.setenv(flag, "ON")
    _install_fake_strategy(strategy, dim=512)
    config = _config_with_strategy(strategy)
    descriptor_index = _FakeDescriptorIndex(dim=512)
    inference_runtime = _FakeInferenceRuntime()
    durations_ms: list[float] = []
    for _ in range(100):
        t0 = time.perf_counter()
        build_vpr_strategy(
            config,
            descriptor_index=descriptor_index,
            inference_runtime=inference_runtime,
        )
        durations_ms.append((time.perf_counter() - t0) * 1000.0)
    durations_ms.sort()
    p99 = durations_ms[int(0.99 * len(durations_ms))]
    assert p99 <= 50.0, (
        f"build_vpr_strategy() p99={p99:.3f} ms exceeds 50 ms NFR"
    )


# ----------------------------------------------------------------------
# Surface coverage — config defaults round-trip.


def test_c2_config_default_strategy_is_net_vlad() -> None:
    cfg = C2VprConfig()
    assert cfg.strategy == "net_vlad"


def test_c2_config_paths_coerce_to_path() -> None:
    cfg = C2VprConfig(
        backbone_weights_path="/tmp/weights",  # type: ignore[arg-type]
        faiss_index_path="/tmp/index.faiss",  # type: ignore[arg-type]
    )
    assert isinstance(cfg.backbone_weights_path, Path)
    assert isinstance(cfg.faiss_index_path, Path)

Some files were not shown because too many files have changed in this diff