[AZ-345] [AZ-346] [AZ-347] [AZ-349] C3 matchers + C3.5 AdHoP refiner

Implement the three concrete C3 CrossDomainMatcher strategies plus the
C3.5 production-default AdHoPRefiner.

C3 (AZ-345/346/347):
- DiskLightGlueMatcher + AlikedLightGlueMatcher share a single shared
  _pipeline.run_lightglue_pipeline orchestrator (decode -> query
  extract -> per-candidate loop -> RANSAC sort -> health update ->
  FDR emit) so the only per-backbone delta is the keypoint+descriptor
  extractor closure. ALIKED adds a create-time engine output-schema
  probe (AC-special-1).
- XFeatMatcher owns its own per-candidate loop (single forward fuses
  extraction + matching); it re-uses the shared FDR emission helpers
  to keep telemetry byte-identical across strategies. lightglue_runtime
  parameter accepted by factory but discarded (AC-special-1).
- All three consume the shared LightGlueRuntime / RansacFilter /
  RollingHealthWindow helpers; no helper forks. InferenceRuntimeCut
  consumer-side Protocol added per AZ-507.

C3.5 (AZ-349):
- AdHoPRefiner implements the <= conditional gate, runs the OrthoLoC
  AdHoP TRT engine over best-candidate correspondences, re-runs RANSAC
  on the perspective-preconditioned set, and emits an enriched
  MatchResult with refinement_label="adhop".
- Invariant 4 passthrough fall-through: any RefinerBackboneError (TRT
  failure, OOM, NaN, bad shape) is caught, logged ERROR, FDR-emitted
  with error: true, and converted to passthrough that still counts
  against the rolling invocation-rate window. MemoryError and other
  non-listed exceptions propagate by design (AC-5 closed-set
  semantics).
- Rolling 60-s invocation-rate window + rate-limited WARN log
  (configurable via ratelimited_warn_window_ns; default 60 s).

Shared changes:
- C3MatcherConfig + C3_5RefinerConfig extended with the new
  weights/threshold/window fields.
- matcher_factory + refiner_factory optionally forward clock +
  fdr_client to the strategy's create(); backward-compatible.
- fdr_client.records registers five new kinds: matcher.frame_done,
  matcher.backbone_error, matcher.insufficient_inliers,
  matcher.all_failed, refiner.frame_done.

Tests: 66 new (43 C3 parametrised + 23 AdHoP) covering 47/47 ACs;
focused suite green; full project test suite green except for one
pre-existing flaky CLI cold-start timing test unrelated to this batch.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-14 04:09:22 +03:00
parent 06f655d8fb
commit a1185d0a28
19 changed files with 4855 additions and 6 deletions
@@ -0,0 +1,509 @@
"""``AdHoPRefiner`` — C3.5 production-default ConditionalRefiner (AZ-349).
Implements the conditional gate documented in
``conditional_refiner_protocol.md`` v1.0.0:
- ``mr.reprojection_residual_px <= residual_threshold_px`` → passthrough
(return ``mr`` unchanged, ``was_invoked() == False``).
- Otherwise → invoke the OrthoLoC AdHoP TRT engine to perspective-
precondition the BEST candidate's correspondences, re-RANSAC, and
emit a new :class:`MatchResult` via :func:`dataclasses.replace`
with ``refinement_label="adhop"``.
INV-4 (passthrough fall-through): :class:`RefinerBackboneError`
raised inside the invoked path is caught, logged at ERROR, emitted
to FDR with ``error: true``, and converted to passthrough — the
input ``mr`` is returned unchanged with ``was_invoked() == True``
(the attempt still counts against the rolling invocation rate). Any
other exception type (e.g. ``MemoryError``) re-propagates by design
— Invariant 4's contract is intentionally closed-set per AC-5.
A 60 s rolling window tracks per-frame ``was_invoked`` outcomes for
the WARN log when the invocation rate exceeds
``config.refiner.invocation_rate_warn_threshold``; the warning is
rate-limited to one record per
``config.refiner.ratelimited_warn_window_ns`` (default 60 s) so a
misbehaving threshold cannot flood the log pipeline.
Layering:
- Imports only L1 (``_types``, ``helpers``, ``fdr_client``, ``clock``,
``config``) and sibling L3 modules (``interface``, ``errors``,
``config``, ``inference_runtime_cut``).
- The C7 inference runtime is accepted through the consumer-side
:class:`InferenceRuntimeCut` Protocol (AZ-507).
"""
from __future__ import annotations
import dataclasses
import logging
from collections import deque
from typing import TYPE_CHECKING, Final
import numpy as np
from gps_denied_onboard._types.inference import BuildConfig, PrecisionMode
from gps_denied_onboard._types.matcher import CandidateMatchSet, MatchResult
from gps_denied_onboard.components.c3_5_adhop.errors import (
RefinerBackboneError,
RefinerConfigError,
)
from gps_denied_onboard.components.c3_5_adhop.inference_runtime_cut import (
InferenceRuntimeCut,
)
from gps_denied_onboard.fdr_client import EnqueueResult, FdrRecord
from gps_denied_onboard.fdr_client.records import CURRENT_SCHEMA_VERSION
from gps_denied_onboard.helpers.iso_timestamps import iso_ts_from_clock
from gps_denied_onboard.helpers.ransac_filter import RansacFilterError
if TYPE_CHECKING:
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["MODEL_NAME", "AdHoPRefiner", "create"]
MODEL_NAME: Final[str] = "adhop"
_REFINEMENT_LABEL_ADHOP: Final[str] = "adhop"
_REFINEMENT_LABEL_PASSTHROUGH: Final[str] = "passthrough"
_FDR_PRODUCER_ID: Final[str] = "c3_5_adhop.adhop_refiner"
_FDR_KIND_FRAME_DONE: Final[str] = "refiner.frame_done"
_LOG_KIND_READY: Final[str] = "c3_5.refiner.ready"
_LOG_KIND_BACKBONE_ERROR: Final[str] = "c3_5.refiner.backbone_error"
_LOG_KIND_INVOCATION_RATE: Final[str] = "c3_5.refiner.invocation_rate_high"
_LOG_KIND_FDR_OVERRUN: Final[str] = "c3_5.refiner.fdr_overrun"
# Rolling window for invocation-rate accounting.
_ROLLING_WINDOW_NS: Final[int] = 60 * 1_000_000_000
# AdHoP engine I/O contract: takes the raw correspondences and
# returns the perspective-preconditioned set. Real upstream contract
# is method-agnostic per OrthoLoC; the runtime ingests them as
# ``correspondences`` in / out.
_INPUT_CORRESPONDENCES: Final[str] = "correspondences"
_OUTPUT_CORRESPONDENCES: Final[str] = "correspondences"
class AdHoPRefiner:
"""OrthoLoC AdHoP refiner.
Single-threaded by contract (Invariant 1); the rolling window
and ``_was_invoked`` flag are non-thread-safe by design.
"""
def __init__(
self,
*,
inference_runtime: InferenceRuntimeCut,
engine_handle: object,
ransac_filter: type["RansacFilter"],
invocation_rate_warn_threshold: float,
ratelimited_warn_window_ns: int,
ransac_threshold_px: float,
min_inliers_threshold: int,
clock: "Clock",
fdr_client: "FdrClient | None",
logger: logging.Logger,
) -> None:
if invocation_rate_warn_threshold <= 0.0 or invocation_rate_warn_threshold >= 1.0:
raise RefinerConfigError(
"AdHoPRefiner.invocation_rate_warn_threshold must be in (0, 1); "
f"got {invocation_rate_warn_threshold}"
)
if ratelimited_warn_window_ns < 1:
raise RefinerConfigError(
"AdHoPRefiner.ratelimited_warn_window_ns must be >= 1; "
f"got {ratelimited_warn_window_ns}"
)
if ransac_threshold_px <= 0.0:
raise RefinerConfigError(
"AdHoPRefiner.ransac_threshold_px must be > 0; "
f"got {ransac_threshold_px}"
)
if min_inliers_threshold < 1:
raise RefinerConfigError(
"AdHoPRefiner.min_inliers_threshold must be >= 1; "
f"got {min_inliers_threshold}"
)
self._inference_runtime = inference_runtime
self._engine_handle = engine_handle
self._ransac_filter = ransac_filter
self._invocation_rate_warn_threshold = float(invocation_rate_warn_threshold)
self._ratelimited_warn_window_ns = int(ratelimited_warn_window_ns)
self._ransac_threshold_px = float(ransac_threshold_px)
self._min_inliers_threshold = int(min_inliers_threshold)
self._clock = clock
self._fdr_client = fdr_client
self._logger = logger
self._was_invoked: bool = False
# Window entries: (timestamp_ns, was_invoked)
self._invocation_window: deque[tuple[int, bool]] = deque()
self._last_warn_ns: int = -1
def refine_if_needed(
self,
frame: "NavCameraFrame",
mr: MatchResult,
residual_threshold_px: float,
) -> MatchResult:
if residual_threshold_px <= 0.0:
raise ValueError(
f"residual_threshold_px must be > 0; got {residual_threshold_px}"
)
start_ns = int(self._clock.monotonic_ns())
# Gate
if mr.reprojection_residual_px <= residual_threshold_px:
self._was_invoked = False
self._record_invocation(now_ns=start_ns, invoked=False)
self._emit_frame_done(
frame_id=int(mr.frame_id),
invoked=False,
refinement_label=_REFINEMENT_LABEL_PASSTHROUGH,
added_latency_ms=0.0,
pre_residual=float(mr.reprojection_residual_px),
post_residual=float(mr.reprojection_residual_px),
inlier_count_before=int(
mr.per_candidate[mr.best_candidate_idx].inlier_count
),
inlier_count_after=int(
mr.per_candidate[mr.best_candidate_idx].inlier_count
),
error=False,
)
return mr
# Invoked path
self._was_invoked = True
self._record_invocation(now_ns=start_ns, invoked=True)
self._maybe_warn_invocation_rate(
now_ns=start_ns, frame_id=int(mr.frame_id)
)
best_idx = mr.best_candidate_idx
best = mr.per_candidate[best_idx]
try:
refined = self._run_adhop(best.inlier_correspondences)
new_set = self._rerunsac(best=best, refined=refined)
except RefinerBackboneError as exc:
elapsed_ms = self._elapsed_ms(start_ns)
self._log_backbone_error(frame_id=int(mr.frame_id), error=exc)
self._emit_frame_done(
frame_id=int(mr.frame_id),
invoked=True,
refinement_label=_REFINEMENT_LABEL_PASSTHROUGH,
added_latency_ms=elapsed_ms,
pre_residual=float(mr.reprojection_residual_px),
post_residual=float(mr.reprojection_residual_px),
inlier_count_before=int(best.inlier_count),
inlier_count_after=int(best.inlier_count),
error=True,
)
return mr
elapsed_ms = self._elapsed_ms(start_ns)
# Replace ONLY the best candidate; other candidates remain
# unchanged so downstream consumers that still index into
# ``per_candidate`` see consistent residual ordering.
new_per_candidate = list(mr.per_candidate)
new_per_candidate[best_idx] = new_set
new_mr = dataclasses.replace(
mr,
per_candidate=tuple(new_per_candidate),
reprojection_residual_px=float(new_set.per_candidate_residual_px),
refinement_label=_REFINEMENT_LABEL_ADHOP,
refinement_added_latency_ms=elapsed_ms,
)
self._emit_frame_done(
frame_id=int(mr.frame_id),
invoked=True,
refinement_label=_REFINEMENT_LABEL_ADHOP,
added_latency_ms=elapsed_ms,
pre_residual=float(mr.reprojection_residual_px),
post_residual=float(new_set.per_candidate_residual_px),
inlier_count_before=int(best.inlier_count),
inlier_count_after=int(new_set.inlier_count),
error=False,
)
return new_mr
def was_invoked(self) -> bool:
return self._was_invoked
def _run_adhop(self, correspondences: np.ndarray) -> np.ndarray:
"""Run the AdHoP TRT engine over the given correspondences.
Translates any backbone failure (TRT error, OOM, NaN, shape
mismatch) into :class:`RefinerBackboneError`. Other
exception types propagate (AC-5).
"""
if correspondences.ndim != 2 or correspondences.shape[1] != 4:
raise RefinerBackboneError(
f"AdHoP input correspondences must have shape (N, 4); "
f"got {correspondences.shape}"
)
try:
outputs = self._inference_runtime.infer(
self._engine_handle,
{_INPUT_CORRESPONDENCES: correspondences.astype(np.float32, copy=False)},
)
except (RuntimeError, ValueError) as exc:
# Map runtime/value errors from the C7 runtime onto the
# public RefinerBackboneError so the passthrough
# fall-through (Invariant 4) is triggered. MemoryError /
# SystemExit / KeyboardInterrupt deliberately not caught
# — AC-5 closed-set semantics.
raise RefinerBackboneError(
f"AdHoP backbone forward raised {type(exc).__name__}: {exc}"
) from exc
if _OUTPUT_CORRESPONDENCES not in outputs:
raise RefinerBackboneError(
f"AdHoP backbone forward returned unexpected keys: "
f"{sorted(outputs.keys())!r}; expected "
f"{_OUTPUT_CORRESPONDENCES!r}"
)
refined = np.asarray(outputs[_OUTPUT_CORRESPONDENCES], dtype=np.float32)
if refined.ndim != 2 or refined.shape[1] != 4:
raise RefinerBackboneError(
f"AdHoP backbone returned shape {refined.shape}; "
"expected (M, 4)"
)
if refined.size > 0 and not np.isfinite(refined).all():
raise RefinerBackboneError(
"AdHoP backbone produced non-finite (NaN/Inf) correspondences"
)
return refined
def _rerunsac(
self, *, best: CandidateMatchSet, refined: np.ndarray
) -> CandidateMatchSet:
if refined.shape[0] < 4:
raise RefinerBackboneError(
"AdHoP-preconditioned correspondences below RANSAC minimum "
f"(got {refined.shape[0]} pairs; need ≥4)"
)
try:
ransac_result = self._ransac_filter.filter_correspondences(
refined, self._ransac_threshold_px, self._min_inliers_threshold
)
except RansacFilterError as exc:
raise RefinerBackboneError(
f"AdHoP post-RANSAC failed: {exc}"
) from exc
if ransac_result.inlier_count == 0:
raise RefinerBackboneError(
"AdHoP post-RANSAC produced zero inliers"
)
return CandidateMatchSet(
tile_id=best.tile_id,
inlier_count=int(ransac_result.inlier_count),
inlier_correspondences=np.ascontiguousarray(
ransac_result.inlier_correspondences, dtype=np.float32
),
ransac_outlier_count=int(ransac_result.outlier_count),
per_candidate_residual_px=float(ransac_result.median_residual_px),
)
def _record_invocation(self, *, now_ns: int, invoked: bool) -> None:
cutoff = now_ns - _ROLLING_WINDOW_NS
window = self._invocation_window
while window and window[0][0] <= cutoff:
window.popleft()
window.append((now_ns, invoked))
def _invocation_rate(self) -> float:
if not self._invocation_window:
return 0.0
invoked = sum(1 for _ts, was_inv in self._invocation_window if was_inv)
return invoked / len(self._invocation_window)
def _maybe_warn_invocation_rate(self, *, now_ns: int, frame_id: int) -> None:
rate = self._invocation_rate()
if rate <= self._invocation_rate_warn_threshold:
return
if (
self._last_warn_ns >= 0
and now_ns - self._last_warn_ns < self._ratelimited_warn_window_ns
):
return
self._last_warn_ns = now_ns
self._logger.warning(
_LOG_KIND_INVOCATION_RATE,
extra={
"kind": _LOG_KIND_INVOCATION_RATE,
"kv": {
"frame_id": frame_id,
"rate": float(rate),
"target_threshold": float(self._invocation_rate_warn_threshold),
},
},
)
def _elapsed_ms(self, start_ns: int) -> float:
return max(0.0, (int(self._clock.monotonic_ns()) - start_ns) / 1_000_000.0)
def _log_backbone_error(self, *, frame_id: int, error: BaseException) -> None:
self._logger.error(
_LOG_KIND_BACKBONE_ERROR,
extra={
"kind": _LOG_KIND_BACKBONE_ERROR,
"kv": {
"frame_id": frame_id,
"error_type": type(error).__name__,
"phase": "adhop_forward",
},
},
)
def _emit_frame_done(
self,
*,
frame_id: int,
invoked: bool,
refinement_label: str,
added_latency_ms: float,
pre_residual: float,
post_residual: float,
inlier_count_before: int,
inlier_count_after: int,
error: bool,
) -> None:
if self._fdr_client is None:
return
payload: dict[str, object] = {
"frame_id": int(frame_id),
"was_invoked": bool(invoked),
"refinement_label": refinement_label,
"refinement_added_latency_ms": float(added_latency_ms),
"pre_residual_px": float(pre_residual),
"post_residual_px": float(post_residual),
"inlier_count_before": int(inlier_count_before),
"inlier_count_after": int(inlier_count_after),
}
if error:
payload["error"] = True
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=iso_ts_from_clock(self._clock),
producer_id=_FDR_PRODUCER_ID,
kind=_FDR_KIND_FRAME_DONE,
payload=payload,
)
try:
result = self._fdr_client.enqueue(record)
except Exception as exc:
self._logger.debug(
"c3_5.refiner.fdr_enqueue_failed",
extra={
"kind": "c3_5.refiner.fdr_enqueue_failed",
"kv": {"frame_id": frame_id, "error": repr(exc)},
},
)
return
if result == EnqueueResult.OVERRUN:
self._logger.warning(
_LOG_KIND_FDR_OVERRUN,
extra={
"kind": _LOG_KIND_FDR_OVERRUN,
"kv": {"frame_id": frame_id, "record_kind": record.kind},
},
)
def _build_adhop_build_config() -> BuildConfig:
return BuildConfig(
precision=PrecisionMode.FP16,
workspace_mb=512,
calibration_dataset=None,
optimization_profiles=(),
)
def create(
config: "Config",
*,
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
clock: "Clock | None" = None,
fdr_client: "FdrClient | None" = None,
logger: logging.Logger | None = None,
) -> "AdHoPRefiner":
"""Module-level factory consumed by :func:`build_refiner_strategy`.
Loads the AdHoP engine ONCE at composition time; subsequent
``refine_if_needed`` calls re-use the resolved handle. The
composition root supplies ``clock`` and ``fdr_client``; both are
optional so unit-tests can drive the refiner with fakes.
"""
block = config.components["c3_5_adhop"]
weights_path = block.adhop_weights_path
if weights_path is None:
raise RefinerConfigError(
"AdHoPRefiner.create: config.components['c3_5_adhop']"
".adhop_weights_path is None; the runtime root MUST populate "
"the AdHoP engine path before constructing this strategy."
)
if clock is None:
from gps_denied_onboard.clock.wall_clock import WallClock
clock = WallClock()
if logger is None:
logger = logging.getLogger("gps_denied_onboard.c3_5_adhop.adhop_refiner")
cache_entry = inference_runtime.compile_engine(
weights_path, _build_adhop_build_config()
)
entry_for_deserialize = type(cache_entry)(
engine_path=cache_entry.engine_path,
sha256_hex=cache_entry.sha256_hex,
sm=cache_entry.sm,
jp=cache_entry.jp,
trt=cache_entry.trt,
precision=cache_entry.precision,
extras={**cache_entry.extras, "model_name": MODEL_NAME},
)
engine_handle = inference_runtime.deserialize_engine(entry_for_deserialize)
# Derive min_inliers floor from the C3 block — same component
# boundary as the matcher's RANSAC. Defaults to 4 if the C3
# block is absent (defensive — tests bypassing C3 registration).
try:
c3_block = config.components["c3_matcher"]
min_inliers_threshold = int(c3_block.min_inliers_threshold)
except KeyError:
min_inliers_threshold = 4
logger.info(
_LOG_KIND_READY,
extra={
"kind": _LOG_KIND_READY,
"kv": {
"strategy": _REFINEMENT_LABEL_ADHOP,
"residual_threshold_px": float(block.residual_threshold_px),
"invocation_rate_warn_threshold": float(
block.invocation_rate_warn_threshold
),
"ransac_threshold_px": float(block.ransac_threshold_px),
},
},
)
return AdHoPRefiner(
inference_runtime=inference_runtime,
engine_handle=engine_handle,
ransac_filter=ransac_filter,
invocation_rate_warn_threshold=block.invocation_rate_warn_threshold,
ratelimited_warn_window_ns=block.ratelimited_warn_window_ns,
ransac_threshold_px=block.ransac_threshold_px,
min_inliers_threshold=min_inliers_threshold,
clock=clock,
fdr_client=fdr_client,
logger=logger,
)
@@ -20,11 +20,31 @@ R10 tunable from operator orchestrator).
``invocation_rate_warn_threshold`` is the rolling-60 s
invocation-rate ceiling above which a WARN log fires
(C3.5-IT-03 / NFT-PERF-01). Must be in ``(0, 1)``; default 0.25.
``adhop_weights_path`` points at the OrthoLoC AdHoP TRT engine
produced offline by AZ-321. The composition-root factory passes
it through to :func:`AdHoPRefiner.create`. ``None`` is allowed as
a placeholder for ``strategy="passthrough"`` (or in tests that
inject a pre-built handle); the AdHoP factory raises
:class:`RefinerConfigError` if the path is missing AND
``strategy="adhop"``.
``ransac_threshold_px`` is the RANSAC reprojection-threshold the
refiner passes to the shared :class:`RansacFilter` when
re-filtering the AdHoP-preconditioned correspondences. Default
3.0 px (matches C3 / C4); same helper instance is identity-shared
across C3 / C3.5 / C4 per ``ransac_filter.md``.
``ratelimited_warn_window_ns`` is the floor between consecutive
``c3_5.refiner.invocation_rate_high`` WARN logs. Default 60 s
(AC-8 — "one per 60 s"). Surfacing this as a knob keeps tests
deterministic without monkey-patching ``time.monotonic_ns``.
"""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
@@ -37,6 +57,8 @@ __all__ = [
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset({"adhop", "passthrough"})
_DEFAULT_WARN_WINDOW_NS: Final[int] = 60 * 1_000_000_000
@dataclass(frozen=True)
class C3_5RefinerConfig:
@@ -45,6 +67,9 @@ class C3_5RefinerConfig:
strategy: str = "adhop"
residual_threshold_px: float = 2.5
invocation_rate_warn_threshold: float = 0.25
adhop_weights_path: Path | None = None
ransac_threshold_px: float = 3.0
ratelimited_warn_window_ns: int = _DEFAULT_WARN_WINDOW_NS
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
@@ -62,3 +87,20 @@ class C3_5RefinerConfig:
"C3_5RefinerConfig.invocation_rate_warn_threshold must be in "
f"(0, 1); got {self.invocation_rate_warn_threshold}"
)
if self.adhop_weights_path is not None and not isinstance(
self.adhop_weights_path, Path
):
raise ConfigError(
"C3_5RefinerConfig.adhop_weights_path must be a pathlib.Path "
f"or None; got {type(self.adhop_weights_path).__name__}"
)
if self.ransac_threshold_px <= 0.0:
raise ConfigError(
"C3_5RefinerConfig.ransac_threshold_px must be > 0; "
f"got {self.ransac_threshold_px}"
)
if self.ratelimited_warn_window_ns < 1:
raise ConfigError(
"C3_5RefinerConfig.ratelimited_warn_window_ns must be >= 1; "
f"got {self.ratelimited_warn_window_ns}"
)
@@ -0,0 +1,65 @@
"""C3.5's structural cut of C7 ``InferenceRuntime`` (AZ-507).
:class:`AdHoPRefiner` (AZ-349) calls into C7's inference runtime to
load the OrthoLoC AdHoP TRT engine and run forward passes during
refinement. Per AZ-507, ``c3_5_adhop`` MUST NOT import
``components.c7_inference`` directly; the consumer-side cut declares
the structural Protocol surface that c3.5 actually uses, and the
composition root binds the c7 runtime as the concrete implementation.
This Protocol mirrors the subset of
:class:`gps_denied_onboard.components.c7_inference.InferenceRuntime`
that the C3.5 refiner consumes — ``compile_engine``,
``deserialize_engine``, ``infer``, ``release_engine``, and
``current_runtime_label``. The full Protocol (which adds
``thermal_state``) is wider; the cut narrows to what C3.5 needs.
Mirrors :mod:`gps_denied_onboard.components.c2_vpr.inference_runtime_cut`
and :mod:`gps_denied_onboard.components.c3_matcher.inference_runtime_cut`
1:1 — each consumer component owns its own structural cut for
single-responsibility, so a future divergence in one consumer does
not silently widen the others' contracts.
DTOs (``BuildConfig``, ``EngineHandle``, ``EngineCacheEntry``) live in
:mod:`gps_denied_onboard._types.inference` (L1) and are imported here
directly — they are L1 shared types, not cross-component imports.
"""
from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING, Literal, Protocol, runtime_checkable
from gps_denied_onboard._types.inference import (
BuildConfig,
EngineCacheEntry,
EngineHandle,
)
if TYPE_CHECKING:
import numpy as np
__all__ = ["InferenceRuntimeCut"]
@runtime_checkable
class InferenceRuntimeCut(Protocol):
"""Subset of C7 ``InferenceRuntime`` consumed by C3.5 refiners."""
def compile_engine(
self, model_path: Path, build_config: BuildConfig
) -> EngineCacheEntry: ...
def deserialize_engine(self, entry: EngineCacheEntry) -> EngineHandle: ...
def infer(
self,
handle: EngineHandle,
inputs: dict[str, "np.ndarray"],
) -> dict[str, "np.ndarray"]: ...
def release_engine(self, handle: EngineHandle) -> None: ...
def current_runtime_label(
self,
) -> Literal["tensorrt", "onnx_trt_ep", "pytorch_fp16"]: ...
@@ -0,0 +1,88 @@
"""C3-internal engine output-schema assertion helper (AZ-346 AC-special-1).
Single home for the dry-run probe that the ``aliked_lightglue``
strategy runs at :func:`create` time to verify that the loaded
inference engine emits ``keypoints`` + ``descriptors`` tensors with
the upstream-published shape contract. Catches the failure mode
where an operator swaps in a misconfigured TRT engine whose output
dimensionality differs from the published ALIKED contract — without
this probe the mismatch would only surface mid-flight as an
:class:`InsufficientInliersError` cascade.
Mirrors :func:`c2_vpr._engine_dim_assertion.assert_engine_output_dim`
in spirit (zero-init probe, ConfigError on mismatch) but works on
the dual-output keypoint+descriptor contract used by ALIKED and
(optionally) DISK rather than the single-tensor descriptor contract
used by C2 VPR backbones.
"""
from __future__ import annotations
from typing import TYPE_CHECKING
import numpy as np
from gps_denied_onboard.config.schema import ConfigError
if TYPE_CHECKING:
from gps_denied_onboard.components.c3_matcher.inference_runtime_cut import (
InferenceRuntimeCut,
)
__all__ = ["assert_keypoint_engine_output_schema"]
def assert_keypoint_engine_output_schema(
inference_runtime: "InferenceRuntimeCut",
handle: object,
*,
input_size: int,
input_key: str = "image",
keypoints_key: str = "keypoints",
descriptors_key: str = "descriptors",
expected_descriptor_dim: int | None = None,
) -> None:
"""Dry-run the engine and verify the output schema.
A single zero-init FP32 NCHW probe is dispatched at the
documented ``input_size``; the returned dict MUST carry the
``keypoints_key`` and ``descriptors_key`` tensors, both 2-D, both
consistent on the keypoint count axis. When
``expected_descriptor_dim`` is given, the descriptors column
count is also validated.
:raises ConfigError: any of the schema invariants fail; the
message names both the expected contract and the offending
actual shape so the operator can identify the misconfigured
engine.
"""
probe = np.zeros((1, 3, input_size, input_size), dtype=np.float32)
outputs = inference_runtime.infer(handle, {input_key: probe})
if keypoints_key not in outputs or descriptors_key not in outputs:
raise ConfigError(
f"engine output schema mismatch: expected keys "
f"{{{keypoints_key!r}, {descriptors_key!r}}}; got "
f"{sorted(outputs.keys())!r}"
)
keypoints = np.asarray(outputs[keypoints_key])
descriptors = np.asarray(outputs[descriptors_key])
if keypoints.ndim != 2 or keypoints.shape[-1] != 2:
raise ConfigError(
f"engine output schema mismatch: expected keypoints "
f"shape (N, 2); got {tuple(keypoints.shape)}"
)
if descriptors.ndim != 2 or descriptors.shape[0] != keypoints.shape[0]:
raise ConfigError(
f"engine output schema mismatch: descriptors shape "
f"{tuple(descriptors.shape)} inconsistent with keypoints "
f"shape {tuple(keypoints.shape)}"
)
if (
expected_descriptor_dim is not None
and descriptors.shape[1] != expected_descriptor_dim
):
raise ConfigError(
f"engine output schema mismatch: expected descriptors "
f"column count {expected_descriptor_dim}; got "
f"{descriptors.shape[1]}"
)
@@ -0,0 +1,674 @@
"""Per-frame matching pipeline shared by DISK+LightGlue and ALIKED+LightGlue (AZ-345/AZ-346).
The DISK and ALIKED matchers differ only in their backbone keypoint
extractor — every other step (per-candidate loop, drop-and-continue,
RANSAC filtering, sort, health-window update, FDR emission) is
identical. To avoid copy-pasting ~200 lines between
``disk_lightglue.py`` and ``aliked_lightglue.py`` (and the bugs that
copy-paste invites), the shared orchestration lives here as a single
``run_lightglue_pipeline`` callable that takes the per-backbone
extractor as a parameter.
XFeat (AZ-347) uses its own pipeline because its backbone fuses
feature extraction and matching into one forward pass — see
``xfeat.py``.
The module is intentionally NOT public: it's imported only by the
sibling concrete strategy modules. The ``_pipeline`` prefix keeps it
out of the public API contract.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Callable
import cv2
import numpy as np
from gps_denied_onboard._types.matcher import CandidateMatchSet, MatchResult
from gps_denied_onboard.components.c3_matcher.errors import (
InsufficientInliersError,
MatcherBackboneError,
)
from gps_denied_onboard.fdr_client import EnqueueResult, FdrRecord
from gps_denied_onboard.fdr_client.records import CURRENT_SCHEMA_VERSION
from gps_denied_onboard.helpers.iso_timestamps import iso_ts_from_clock
from gps_denied_onboard.helpers.ransac_filter import RansacFilterError
if TYPE_CHECKING:
from gps_denied_onboard._types.matching import KeypointSet
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c3_matcher._health_window import (
RollingHealthWindow,
)
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = [
"QueryExtractError",
"TileDecodeError",
"TileExtractError",
"decode_bgr_image",
"run_lightglue_pipeline",
]
_FDR_KIND_FRAME_DONE = "matcher.frame_done"
_FDR_KIND_BACKBONE_ERROR = "matcher.backbone_error"
_FDR_KIND_INSUFFICIENT = "matcher.insufficient_inliers"
_FDR_KIND_ALL_FAILED = "matcher.all_failed"
_LOG_KIND_RESIDUAL_WARN = "c3.matcher.residual_above_threshold"
_LOG_KIND_BACKBONE_ERROR = "c3.matcher.backbone_error"
_LOG_KIND_INSUFFICIENT = "c3.matcher.insufficient_inliers"
_LOG_KIND_ZERO_INLIERS = "c3.matcher.zero_inliers"
_LOG_KIND_FDR_OVERRUN = "c3.matcher.fdr_overrun"
class QueryExtractError(MatcherBackboneError):
"""Internal flag for the query-frame extraction failure path."""
class TileDecodeError(MatcherBackboneError):
"""Internal flag for tile-pixel decode failures (corrupt JPEG)."""
class TileExtractError(MatcherBackboneError):
"""Internal flag for per-tile keypoint extraction failures."""
def decode_bgr_image(image: object) -> np.ndarray | None:
"""Coerce ``NavCameraFrame.image`` or a JPEG byte buffer to BGR ``ndarray``.
Accepts an already-decoded ``np.ndarray`` (returned as-is) or a JPEG/PNG
byte buffer (decoded via ``cv2.imdecode``). Anything else returns
``None``; callers route the ``None`` through the backbone-error drop
path. Mirrors :func:`_ensure_bgr_array` in the C2.5 reranker so both
components decode tile pixels identically.
"""
if isinstance(image, np.ndarray):
return image
if isinstance(image, (bytes, bytearray, memoryview)):
data = bytes(image)
if not data:
return None
buf = np.frombuffer(data, dtype=np.uint8)
return cv2.imdecode(buf, cv2.IMREAD_COLOR)
return None
def _decode_tile_jpeg(jpeg_view: memoryview) -> np.ndarray:
"""Decode a JPEG ``memoryview`` produced by C6 into a BGR ``ndarray``."""
data = bytes(jpeg_view)
if not data:
raise ValueError("empty JPEG buffer")
buf = np.frombuffer(data, dtype=np.uint8)
decoded = cv2.imdecode(buf, cv2.IMREAD_COLOR)
if decoded is None:
raise ValueError("cv2.imdecode returned None for tile JPEG")
return decoded
def run_lightglue_pipeline(
*,
frame: "NavCameraFrame",
rerank_result: "RerankResult",
matcher_label: str,
extract_features: Callable[[np.ndarray], "KeypointSet"],
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
health_window: "RollingHealthWindow",
clock: "Clock",
fdr_client: "FdrClient | None",
fdr_producer_id: str,
min_inliers_threshold: int,
ransac_threshold_px: float,
residual_warn_threshold_px: float,
logger: logging.Logger,
) -> MatchResult:
"""DISK / ALIKED per-candidate match loop.
The single substantive parameter is ``extract_features`` — the
backbone-specific image → :class:`KeypointSet` callable. Every
other parameter is plumbing the per-strategy class already holds
by reference. Returns a :class:`MatchResult` or raises
:class:`InsufficientInliersError`.
"""
candidates_input = len(rerank_result.candidates)
frame_id = int(rerank_result.frame_id)
dropped = 0
survivors: list[CandidateMatchSet] = []
query_image = decode_bgr_image(frame.image)
if query_image is None:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=None,
phase="query_extract",
error_type="ImageDecodeError",
error_message="NavCameraFrame.image could not be decoded",
)
_fail_all(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
health_window=health_window,
matcher_label=matcher_label,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
now_ns=int(clock.monotonic_ns()),
)
try:
query_features = extract_features(query_image)
except Exception as exc:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=None,
phase="query_extract",
error_type=type(exc).__name__,
error_message=str(exc),
)
_fail_all(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
health_window=health_window,
matcher_label=matcher_label,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
now_ns=int(clock.monotonic_ns()),
)
for candidate in rerank_result.candidates:
survivor = _process_candidate(
candidate=candidate,
query_features=query_features,
extract_features=extract_features,
lightglue_runtime=lightglue_runtime,
ransac_filter=ransac_filter,
ransac_threshold_px=ransac_threshold_px,
min_inliers_threshold=min_inliers_threshold,
matcher_label=matcher_label,
frame_id=frame_id,
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
)
if survivor is None:
dropped += 1
continue
survivors.append(survivor)
now_ns = int(clock.monotonic_ns())
had_backbone_error = dropped > 0
if not survivors:
_fail_all(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
health_window=health_window,
matcher_label=matcher_label,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
now_ns=now_ns,
had_backbone_error=had_backbone_error,
)
survivors.sort(key=lambda s: (-s.inlier_count, s.per_candidate_residual_px))
max_inliers = survivors[0].inlier_count
if max_inliers < min_inliers_threshold:
_emit_insufficient_inliers(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
max_inlier_count=max_inliers,
)
health_window.update(
timestamp_ns=now_ns,
best_inlier_count=0,
had_backbone_error=had_backbone_error,
)
raise InsufficientInliersError(
f"{matcher_label}.match: every survivor's inlier_count is below "
f"min_inliers_threshold={min_inliers_threshold} "
f"(max_inlier_count={max_inliers}, frame_id={frame_id})"
)
best = survivors[0]
if best.per_candidate_residual_px > residual_warn_threshold_px:
logger.warning(
_LOG_KIND_RESIDUAL_WARN,
extra={
"kind": _LOG_KIND_RESIDUAL_WARN,
"kv": {
"frame_id": frame_id,
"matcher_label": matcher_label,
"residual_px": float(best.per_candidate_residual_px),
"threshold_px": float(residual_warn_threshold_px),
},
},
)
health_window.update(
timestamp_ns=now_ns,
best_inlier_count=int(best.inlier_count),
had_backbone_error=had_backbone_error,
)
result = MatchResult(
frame_id=frame_id,
per_candidate=tuple(survivors),
best_candidate_idx=0,
reprojection_residual_px=float(best.per_candidate_residual_px),
matched_at=now_ns,
matcher_label=matcher_label,
candidates_input=candidates_input,
candidates_dropped=dropped,
)
_emit_frame_done(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
result=result,
)
return result
def _process_candidate(
*,
candidate: "RerankCandidate",
query_features: "KeypointSet",
extract_features: Callable[[np.ndarray], "KeypointSet"],
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
ransac_threshold_px: float,
min_inliers_threshold: int,
matcher_label: str,
frame_id: int,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
) -> CandidateMatchSet | None:
tile_id = candidate.tile_id
handle = candidate.tile_pixels_handle
try:
with handle as jpeg_view:
tile_image = _decode_tile_jpeg(jpeg_view)
except (ValueError, AttributeError, TypeError) as exc:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=tile_id,
phase="tile_decode",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
try:
tile_features = extract_features(tile_image)
except Exception as exc:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=tile_id,
phase="tile_extract",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
try:
correspondences = lightglue_runtime.match(query_features, tile_features)
except Exception as exc:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=tile_id,
phase="lightglue_match",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
raw = np.asarray(correspondences.correspondences, dtype=np.float32)
if raw.ndim != 2 or raw.shape[1] != 4 or raw.shape[0] < 4:
# Below RANSAC's homography minimum (≥4 pairs); drop quietly
# at DEBUG. AC-9 contract still requires shape (I, 4); zero
# survivors are absorbed by the drop-and-continue accounting.
logger.debug(
_LOG_KIND_ZERO_INLIERS,
extra={
"kind": _LOG_KIND_ZERO_INLIERS,
"kv": {
"frame_id": frame_id,
"matcher_label": matcher_label,
"tile_id": list(tile_id),
"correspondences": int(raw.shape[0]) if raw.ndim == 2 else 0,
},
},
)
return None
try:
ransac_result = ransac_filter.filter_correspondences(
raw,
ransac_threshold_px,
min_inliers_threshold,
)
except RansacFilterError as exc:
_emit_backbone_error(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
tile_id=tile_id,
phase="ransac",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
if ransac_result.inlier_count == 0:
logger.debug(
_LOG_KIND_ZERO_INLIERS,
extra={
"kind": _LOG_KIND_ZERO_INLIERS,
"kv": {
"frame_id": frame_id,
"matcher_label": matcher_label,
"tile_id": list(tile_id),
},
},
)
return None
inliers = np.ascontiguousarray(
ransac_result.inlier_correspondences, dtype=np.float32
)
return CandidateMatchSet(
tile_id=tile_id,
inlier_count=int(ransac_result.inlier_count),
inlier_correspondences=inliers,
ransac_outlier_count=int(ransac_result.outlier_count),
per_candidate_residual_px=float(ransac_result.median_residual_px),
)
def _emit_frame_done(
*,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
result: MatchResult,
) -> None:
if fdr_client is None:
return
best = result.per_candidate[result.best_candidate_idx]
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=iso_ts_from_clock(clock),
producer_id=fdr_producer_id,
kind=_FDR_KIND_FRAME_DONE,
payload={
"frame_id": int(result.frame_id),
"matcher_label": result.matcher_label,
"candidates_input": int(result.candidates_input),
"candidates_dropped": int(result.candidates_dropped),
"best_inlier_count": int(best.inlier_count),
"best_residual_px": float(best.per_candidate_residual_px),
"best_tile_id": list(best.tile_id),
},
)
_safe_enqueue(logger, fdr_client, record, frame_id=result.frame_id)
def _emit_backbone_error(
*,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
matcher_label: str,
frame_id: int,
tile_id: tuple[int, float, float] | None,
phase: str,
error_type: str,
error_message: str,
) -> None:
kv: dict[str, object] = {
"frame_id": frame_id,
"matcher_label": matcher_label,
"phase": phase,
"error_type": error_type,
}
if tile_id is not None:
kv["tile_id"] = list(tile_id)
logger.error(
_LOG_KIND_BACKBONE_ERROR,
extra={"kind": _LOG_KIND_BACKBONE_ERROR, "kv": kv},
)
if fdr_client is None:
return
payload: dict[str, object] = {
"frame_id": int(frame_id),
"matcher_label": matcher_label,
"phase": phase,
"error_type": error_type,
"error_message": error_message[:512],
}
if tile_id is not None:
payload["tile_id"] = list(tile_id)
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=iso_ts_from_clock(clock),
producer_id=fdr_producer_id,
kind=_FDR_KIND_BACKBONE_ERROR,
payload=payload,
)
_safe_enqueue(logger, fdr_client, record, frame_id=frame_id)
def _emit_insufficient_inliers(
*,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
matcher_label: str,
frame_id: int,
candidates_input: int,
candidates_dropped: int,
max_inlier_count: int,
) -> None:
logger.error(
_LOG_KIND_INSUFFICIENT,
extra={
"kind": _LOG_KIND_INSUFFICIENT,
"kv": {
"frame_id": frame_id,
"matcher_label": matcher_label,
"candidates_input": candidates_input,
"candidates_dropped": candidates_dropped,
"max_inlier_count": max_inlier_count,
},
},
)
if fdr_client is None:
return
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=iso_ts_from_clock(clock),
producer_id=fdr_producer_id,
kind=_FDR_KIND_INSUFFICIENT,
payload={
"frame_id": int(frame_id),
"matcher_label": matcher_label,
"candidates_input": int(candidates_input),
"candidates_dropped": int(candidates_dropped),
"max_inlier_count": int(max_inlier_count),
},
)
_safe_enqueue(logger, fdr_client, record, frame_id=frame_id)
def _emit_all_failed(
*,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
matcher_label: str,
frame_id: int,
candidates_input: int,
candidates_dropped: int,
) -> None:
logger.error(
"c3.matcher.all_failed",
extra={
"kind": "c3.matcher.all_failed",
"kv": {
"frame_id": frame_id,
"matcher_label": matcher_label,
"candidates_input": candidates_input,
"candidates_dropped": candidates_dropped,
},
},
)
if fdr_client is None:
return
record = FdrRecord(
schema_version=CURRENT_SCHEMA_VERSION,
ts=iso_ts_from_clock(clock),
producer_id=fdr_producer_id,
kind=_FDR_KIND_ALL_FAILED,
payload={
"frame_id": int(frame_id),
"matcher_label": matcher_label,
"candidates_input": int(candidates_input),
"candidates_dropped": int(candidates_dropped),
},
)
_safe_enqueue(logger, fdr_client, record, frame_id=frame_id)
def _fail_all(
*,
logger: logging.Logger,
fdr_client: "FdrClient | None",
fdr_producer_id: str,
clock: "Clock",
health_window: "RollingHealthWindow",
matcher_label: str,
frame_id: int,
candidates_input: int,
candidates_dropped: int,
now_ns: int,
had_backbone_error: bool = True,
) -> None:
"""Emit ``matcher.all_failed`` + update health window + raise ``InsufficientInliersError``.
Always reached when there are zero survivors (every candidate
dropped OR no candidates supplied). ``had_backbone_error``
defaults to True because zero-survivor paths in practice imply at
least one backbone failure; callers override to False on the
"empty rerank input" sub-case (the rerank produced nothing — not
a matcher backbone problem).
"""
_emit_all_failed(
logger=logger,
fdr_client=fdr_client,
fdr_producer_id=fdr_producer_id,
clock=clock,
matcher_label=matcher_label,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_dropped,
)
health_window.update(
timestamp_ns=now_ns,
best_inlier_count=0,
had_backbone_error=had_backbone_error,
)
raise InsufficientInliersError(
f"{matcher_label}.match: zero survivors "
f"(frame_id={frame_id}, candidates_input={candidates_input}, "
f"candidates_dropped={candidates_dropped})"
)
def _safe_enqueue(
logger: logging.Logger,
fdr_client: "FdrClient",
record: FdrRecord,
*,
frame_id: int,
) -> None:
try:
result = fdr_client.enqueue(record)
except Exception as exc:
# FDR enqueue failures are observability-only; they must
# NEVER promote to a matcher drop event.
logger.debug(
"c3.matcher.fdr_enqueue_failed",
extra={
"kind": "c3.matcher.fdr_enqueue_failed",
"kv": {"frame_id": frame_id, "error": repr(exc)},
},
)
return
if result == EnqueueResult.OVERRUN:
logger.warning(
_LOG_KIND_FDR_OVERRUN,
extra={
"kind": _LOG_KIND_FDR_OVERRUN,
"kv": {"frame_id": frame_id, "record_kind": record.kind},
},
)
@@ -0,0 +1,289 @@
"""``AlikedLightGlueMatcher`` — C3 secondary CrossDomainMatcher (AZ-346).
Mirrors :mod:`gps_denied_onboard.components.c3_matcher.disk_lightglue`
modulo backbone choice: ALIKED replaces DISK for the per-frame
keypoint+descriptor extraction step; LightGlue + RANSAC stages are
unchanged.
ALIKED is the documented fallback if a future D-C3-1 IT-12 verdict
shifts away from DISK, or if DISK's licensing / upstream maintenance
changes mid-cycle. Both backbones ship in airborne / research
binaries (ADR-002 allows multiple matchers at link time; only one is
selected at runtime by ``config.matcher.strategy``).
Preprocessor parameters (input size, normalisation) are hard-coded
per AZ-346 Constraint — weights-coupled, same rule as DISK.
See ``disk_lightglue.py`` for the layering + composition rationale;
this module follows the same pattern.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Final, Literal
import cv2
import numpy as np
from gps_denied_onboard._types.inference import (
BuildConfig,
PrecisionMode,
)
from gps_denied_onboard._types.matching import KeypointSet
from gps_denied_onboard.components.c3_matcher._engine_output_assertion import (
assert_keypoint_engine_output_schema,
)
from gps_denied_onboard.components.c3_matcher._pipeline import (
TileExtractError,
run_lightglue_pipeline,
)
from gps_denied_onboard.components.c3_matcher.inference_runtime_cut import (
InferenceRuntimeCut,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matcher import MatchResult, MatcherHealth
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c3_matcher._health_window import (
RollingHealthWindow,
)
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["MODEL_NAME", "AlikedLightGlueMatcher", "create"]
MODEL_NAME: Final[str] = "aliked"
_MATCHER_LABEL: Final[Literal["aliked_lightglue"]] = "aliked_lightglue"
_FDR_PRODUCER_ID: Final[str] = "c3_matcher.aliked_lightglue"
_LOG_KIND_READY: Final[str] = "c3.matcher.ready"
# ALIKED input contract: NCHW float32 RGB at 480x480, ImageNet-style
# normalisation. Hard-coded per Constraint — weights-coupled.
_INPUT_SIZE: Final[int] = 480
_INPUT_KEY: Final[str] = "image"
_OUTPUT_KEYPOINTS: Final[str] = "keypoints"
_OUTPUT_DESCRIPTORS: Final[str] = "descriptors"
# ImageNet mean/std for ALIKED. Same convention as UltraVPR's preprocessor.
_IMAGENET_MEAN: Final[tuple[float, float, float]] = (0.485, 0.456, 0.406)
_IMAGENET_STD: Final[tuple[float, float, float]] = (0.229, 0.224, 0.225)
class AlikedLightGlueMatcher:
"""ALIKED + LightGlue cross-domain matcher.
Stateless per-frame except for the rolling health window. See
module docstring for the architectural picture and the
``cross_domain_matcher_protocol.md`` v1.0.0 contract for the
public invariants this class satisfies.
"""
def __init__(
self,
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
engine_handle: object,
clock: "Clock",
fdr_client: "FdrClient | None",
logger: logging.Logger,
) -> None:
block = config.components["c3_matcher"]
self._config = config
self._lightglue_runtime = lightglue_runtime
self._ransac_filter = ransac_filter
self._inference_runtime = inference_runtime
self._engine_handle = engine_handle
self._health_window = health_window
self._clock = clock
self._fdr_client = fdr_client
self._logger = logger
self._min_inliers_threshold: int = int(block.min_inliers_threshold)
self._residual_warn_threshold_px: float = float(
block.residual_warn_threshold_px
)
self._ransac_threshold_px: float = float(block.ransac_threshold_px)
def match(
self,
frame: "NavCameraFrame",
rerank_result: "RerankResult",
calibration: "CameraCalibration",
) -> "MatchResult":
del calibration
return run_lightglue_pipeline(
frame=frame,
rerank_result=rerank_result,
matcher_label=_MATCHER_LABEL,
extract_features=self._extract_features,
lightglue_runtime=self._lightglue_runtime,
ransac_filter=self._ransac_filter,
health_window=self._health_window,
clock=self._clock,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
min_inliers_threshold=self._min_inliers_threshold,
ransac_threshold_px=self._ransac_threshold_px,
residual_warn_threshold_px=self._residual_warn_threshold_px,
logger=self._logger,
)
def health_snapshot(self) -> "MatcherHealth":
return self._health_window.snapshot()
def _extract_features(self, image_bgr: np.ndarray) -> KeypointSet:
tensor = _preprocess_aliked(image_bgr)
outputs = self._inference_runtime.infer(
self._engine_handle, {_INPUT_KEY: tensor}
)
return _outputs_to_keypoint_set(outputs)
def _preprocess_aliked(image_bgr: np.ndarray) -> np.ndarray:
"""BGR → RGB 480×480 ImageNet-normalised float32 NCHW.
Hard-coded weights-coupled preprocessing. Accepts (H, W, 3) BGR
or (H, W) grayscale (broadcast to 3 channels).
"""
if image_bgr.ndim == 3:
rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
elif image_bgr.ndim == 2:
rgb = cv2.cvtColor(image_bgr, cv2.COLOR_GRAY2RGB)
else:
raise TileExtractError(
f"ALIKED preprocessor: image must be 2-D (gray) or 3-D (BGR); "
f"got ndim={image_bgr.ndim} shape={image_bgr.shape}"
)
if rgb.shape[0] != _INPUT_SIZE or rgb.shape[1] != _INPUT_SIZE:
rgb = cv2.resize(
rgb, (_INPUT_SIZE, _INPUT_SIZE), interpolation=cv2.INTER_AREA
)
rgb_f = rgb.astype(np.float32) / 255.0
mean = np.asarray(_IMAGENET_MEAN, dtype=np.float32)
std = np.asarray(_IMAGENET_STD, dtype=np.float32)
normalised = (rgb_f - mean) / std
tensor = np.transpose(normalised, (2, 0, 1))[None, :, :, :]
return np.ascontiguousarray(tensor, dtype=np.float32)
def _outputs_to_keypoint_set(outputs: dict[str, np.ndarray]) -> KeypointSet:
if _OUTPUT_KEYPOINTS not in outputs or _OUTPUT_DESCRIPTORS not in outputs:
raise TileExtractError(
f"ALIKED forward returned unexpected keys: "
f"{sorted(outputs.keys())!r}; expected {_OUTPUT_KEYPOINTS!r} + "
f"{_OUTPUT_DESCRIPTORS!r}"
)
keypoints = np.asarray(outputs[_OUTPUT_KEYPOINTS], dtype=np.float32)
descriptors = np.asarray(outputs[_OUTPUT_DESCRIPTORS], dtype=np.float32)
if keypoints.ndim != 2 or keypoints.shape[1] != 2:
raise TileExtractError(
f"ALIKED keypoints must have shape (N, 2); got {keypoints.shape}"
)
if descriptors.ndim != 2 or descriptors.shape[0] != keypoints.shape[0]:
raise TileExtractError(
f"ALIKED descriptors shape {descriptors.shape} inconsistent with "
f"keypoints {keypoints.shape}"
)
return KeypointSet(keypoints=keypoints, descriptors=descriptors)
def _build_aliked_build_config() -> BuildConfig:
return BuildConfig(
precision=PrecisionMode.FP16,
workspace_mb=512,
calibration_dataset=None,
optimization_profiles=(),
)
def create(
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
clock: "Clock | None" = None,
fdr_client: "FdrClient | None" = None,
logger: logging.Logger | None = None,
) -> "AlikedLightGlueMatcher":
"""Module-level factory consumed by :func:`build_matcher_strategy`."""
block = config.components["c3_matcher"]
weights_path = block.aliked_weights_path
if weights_path is None:
raise ValueError(
"AlikedLightGlueMatcher.create: config.components['c3_matcher']"
".aliked_weights_path is None; the runtime root MUST populate "
"the ALIKED engine path before constructing this strategy."
)
if clock is None:
from gps_denied_onboard.clock.wall_clock import WallClock
clock = WallClock()
if logger is None:
logger = logging.getLogger("gps_denied_onboard.c3_matcher.aliked_lightglue")
cache_entry = inference_runtime.compile_engine(
weights_path, _build_aliked_build_config()
)
entry_for_deserialize = type(cache_entry)(
engine_path=cache_entry.engine_path,
sha256_hex=cache_entry.sha256_hex,
sm=cache_entry.sm,
jp=cache_entry.jp,
trt=cache_entry.trt,
precision=cache_entry.precision,
extras={**cache_entry.extras, "model_name": MODEL_NAME},
)
engine_handle = inference_runtime.deserialize_engine(entry_for_deserialize)
# AZ-346 AC-special-1: validate the engine's output schema once
# at startup so a misconfigured ALIKED engine surfaces as a
# composition-time ConfigError instead of a mid-flight
# InsufficientInliersError cascade.
assert_keypoint_engine_output_schema(
inference_runtime,
engine_handle,
input_size=_INPUT_SIZE,
input_key=_INPUT_KEY,
keypoints_key=_OUTPUT_KEYPOINTS,
descriptors_key=_OUTPUT_DESCRIPTORS,
)
logger.info(
_LOG_KIND_READY,
extra={
"kind": _LOG_KIND_READY,
"kv": {
"strategy": _MATCHER_LABEL,
"min_inliers_threshold": int(block.min_inliers_threshold),
"residual_warn_threshold_px": float(
block.residual_warn_threshold_px
),
"ransac_threshold_px": float(block.ransac_threshold_px),
},
},
)
return AlikedLightGlueMatcher(
config,
lightglue_runtime=lightglue_runtime,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
health_window=health_window,
engine_handle=engine_handle,
clock=clock,
fdr_client=fdr_client,
logger=logger,
)
@@ -1,4 +1,4 @@
"""C3 ``CrossDomainMatcher`` config block (AZ-344).
"""C3 ``CrossDomainMatcher`` config block (AZ-344 + AZ-345/346/347).
Registered into ``config.components['c3_matcher']`` by the package
``__init__.py``. The composition-root factory
@@ -21,11 +21,28 @@ tune it.
``residual_warn_threshold_px`` is the median reprojection-residual
limit (pixels) above which the matcher emits a WARN log; default
2.5 px (the AC-1.2 floor).
``ransac_threshold_px`` is the RANSAC reprojection-threshold passed
to the shared :class:`gps_denied_onboard.helpers.ransac_filter.RansacFilter`
inside each per-candidate match. Default 3.0 px (matches C2.5's
single-pair threshold; C3 explicitly re-runs the filter on the
LightGlue / XFeat correspondences because the homography under
satellite ↔ nav has a different geometric envelope).
``disk_weights_path`` / ``aliked_weights_path`` / ``xfeat_weights_path``
point at the engine files produced offline by AZ-321; the
composition-root factory passes them through to the concrete
strategy's ``create(...)`` factory. ``adhop_weights_path`` is read
by C3.5; it lives in :class:`C3_5RefinerConfig` (sibling block) —
not duplicated here. ``None`` is allowed as a placeholder for
``BUILD_MATCHER_<variant>=OFF`` binaries; the factory rejects
``None`` only for the SELECTED strategy at composition time.
"""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
@@ -48,6 +65,10 @@ class C3MatcherConfig:
strategy: str = "disk_lightglue"
min_inliers_threshold: int = 60
residual_warn_threshold_px: float = 2.5
ransac_threshold_px: float = 3.0
disk_weights_path: Path | None = None
aliked_weights_path: Path | None = None
xfeat_weights_path: Path | None = None
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
@@ -65,3 +86,15 @@ class C3MatcherConfig:
"C3MatcherConfig.residual_warn_threshold_px must be > 0; "
f"got {self.residual_warn_threshold_px}"
)
if self.ransac_threshold_px <= 0.0:
raise ConfigError(
"C3MatcherConfig.ransac_threshold_px must be > 0; "
f"got {self.ransac_threshold_px}"
)
for name in ("disk_weights_path", "aliked_weights_path", "xfeat_weights_path"):
value = getattr(self, name)
if value is not None and not isinstance(value, Path):
raise ConfigError(
f"C3MatcherConfig.{name} must be a pathlib.Path or None; "
f"got {type(value).__name__}"
)
@@ -0,0 +1,288 @@
"""``DiskLightGlueMatcher`` — C3 production-default CrossDomainMatcher (AZ-345).
Per ``components/04_c3_matcher/description.md`` § 1 (D-C3-1 = (a)) DISK is
the cross-domain feature extractor of choice; LightGlue matches the
DISK keypoints from the nav frame against each top-N rerank candidate's
tile, and ``RansacFilter`` (AZ-282) recomputes the homography-RANSAC
inlier set + median reprojection residual per candidate.
The per-frame orchestration (drop-and-continue, deterministic sort,
``RollingHealthWindow.update``, FDR + log emission) lives in
:mod:`gps_denied_onboard.components.c3_matcher._pipeline` so the three
LightGlue-based backbones (DISK, ALIKED — and a future descendant)
cannot drift in semantics; this module only contributes the
DISK-specific image preprocessor, engine loading, and forward-pass
wrapper.
The preprocessor parameters are hard-coded per AZ-345 Constraint:
weights-coupled — making them config knobs would let an operator
silently violate the AC-1.1 inlier-count floor. Future weight drops
that change the input contract require updating ``MODEL_NAME`` /
``_INPUT_SIZE`` in tandem with the engine compile (AZ-321), not a
runtime config override.
Layering:
- This module imports from L1 (`_types`, `helpers`, `fdr_client`,
`clock`, `config`) and from sibling L3 modules inside the same
component (`_pipeline`, `_health_window`, `errors`, `config`,
`interface`).
- It MUST NOT import from another L3 component; the C7 inference
runtime is accepted through the consumer-side
:class:`InferenceRuntimeCut` Protocol (AZ-507).
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Final, Literal
import cv2
import numpy as np
from gps_denied_onboard._types.inference import (
BuildConfig,
PrecisionMode,
)
from gps_denied_onboard._types.matching import KeypointSet
from gps_denied_onboard.components.c3_matcher._pipeline import (
TileExtractError,
run_lightglue_pipeline,
)
from gps_denied_onboard.components.c3_matcher.inference_runtime_cut import (
InferenceRuntimeCut,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matcher import MatchResult, MatcherHealth
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c3_matcher._health_window import (
RollingHealthWindow,
)
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["MODEL_NAME", "DiskLightGlueMatcher", "create"]
MODEL_NAME: Final[str] = "disk"
_MATCHER_LABEL: Final[Literal["disk_lightglue"]] = "disk_lightglue"
_FDR_PRODUCER_ID: Final[str] = "c3_matcher.disk_lightglue"
_LOG_KIND_READY: Final[str] = "c3.matcher.ready"
# DISK input contract: NCHW float32 grayscale at 480x480, normalised
# to [0, 1]. Hard-coded per Constraint — weights-coupled.
_INPUT_SIZE: Final[int] = 480
_INPUT_KEY: Final[str] = "image"
_OUTPUT_KEYPOINTS: Final[str] = "keypoints"
_OUTPUT_DESCRIPTORS: Final[str] = "descriptors"
class DiskLightGlueMatcher:
"""DISK + LightGlue cross-domain matcher.
Stateless per-frame except for the rolling health window. See
module docstring for the architectural picture and the
``cross_domain_matcher_protocol.md`` v1.0.0 contract for the
public invariants this class satisfies.
"""
def __init__(
self,
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
engine_handle: object,
clock: "Clock",
fdr_client: "FdrClient | None",
logger: logging.Logger,
) -> None:
block = config.components["c3_matcher"]
self._config = config
self._lightglue_runtime = lightglue_runtime
self._ransac_filter = ransac_filter
self._inference_runtime = inference_runtime
self._engine_handle = engine_handle
self._health_window = health_window
self._clock = clock
self._fdr_client = fdr_client
self._logger = logger
self._min_inliers_threshold: int = int(block.min_inliers_threshold)
self._residual_warn_threshold_px: float = float(block.residual_warn_threshold_px)
self._ransac_threshold_px: float = float(block.ransac_threshold_px)
def match(
self,
frame: "NavCameraFrame",
rerank_result: "RerankResult",
calibration: "CameraCalibration",
) -> "MatchResult":
del calibration # not consumed at the matcher boundary (yet); C4 uses it
return run_lightglue_pipeline(
frame=frame,
rerank_result=rerank_result,
matcher_label=_MATCHER_LABEL,
extract_features=self._extract_features,
lightglue_runtime=self._lightglue_runtime,
ransac_filter=self._ransac_filter,
health_window=self._health_window,
clock=self._clock,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
min_inliers_threshold=self._min_inliers_threshold,
ransac_threshold_px=self._ransac_threshold_px,
residual_warn_threshold_px=self._residual_warn_threshold_px,
logger=self._logger,
)
def health_snapshot(self) -> "MatcherHealth":
return self._health_window.snapshot()
def _extract_features(self, image_bgr: np.ndarray) -> KeypointSet:
"""Preprocess + DISK forward → ``KeypointSet`` (single image)."""
tensor = _preprocess_disk(image_bgr)
outputs = self._inference_runtime.infer(
self._engine_handle, {_INPUT_KEY: tensor}
)
return _outputs_to_keypoint_set(outputs)
def _preprocess_disk(image_bgr: np.ndarray) -> np.ndarray:
"""BGR → grayscale 480×480 float32 NCHW in ``[0, 1]``.
Hard-coded weights-coupled preprocessing. Accepts (H, W, 3) BGR
or (H, W) grayscale; resizes preserving DISK's training input
contract.
"""
if image_bgr.ndim == 3:
gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
elif image_bgr.ndim == 2:
gray = image_bgr
else:
raise TileExtractError(
f"DISK preprocessor: image must be 2-D (gray) or 3-D (BGR); "
f"got ndim={image_bgr.ndim} shape={image_bgr.shape}"
)
if gray.shape[0] != _INPUT_SIZE or gray.shape[1] != _INPUT_SIZE:
gray = cv2.resize(
gray, (_INPUT_SIZE, _INPUT_SIZE), interpolation=cv2.INTER_AREA
)
if gray.dtype != np.float32:
gray = gray.astype(np.float32) / 255.0
else:
gray = gray / 255.0
return gray.reshape(1, 1, _INPUT_SIZE, _INPUT_SIZE).astype(np.float32, copy=False)
def _outputs_to_keypoint_set(outputs: dict[str, np.ndarray]) -> KeypointSet:
if _OUTPUT_KEYPOINTS not in outputs or _OUTPUT_DESCRIPTORS not in outputs:
raise TileExtractError(
f"DISK forward returned unexpected keys: {sorted(outputs.keys())!r}; "
f"expected {_OUTPUT_KEYPOINTS!r} + {_OUTPUT_DESCRIPTORS!r}"
)
keypoints = np.asarray(outputs[_OUTPUT_KEYPOINTS], dtype=np.float32)
descriptors = np.asarray(outputs[_OUTPUT_DESCRIPTORS], dtype=np.float32)
if keypoints.ndim != 2 or keypoints.shape[1] != 2:
raise TileExtractError(
f"DISK keypoints must have shape (N, 2); got {keypoints.shape}"
)
if descriptors.ndim != 2 or descriptors.shape[0] != keypoints.shape[0]:
raise TileExtractError(
f"DISK descriptors shape {descriptors.shape} inconsistent with "
f"keypoints {keypoints.shape}"
)
return KeypointSet(keypoints=keypoints, descriptors=descriptors)
def _build_disk_build_config() -> BuildConfig:
return BuildConfig(
precision=PrecisionMode.FP16,
workspace_mb=512,
calibration_dataset=None,
optimization_profiles=(),
)
def create(
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
clock: "Clock | None" = None,
fdr_client: "FdrClient | None" = None,
logger: logging.Logger | None = None,
) -> "DiskLightGlueMatcher":
"""Module-level factory consumed by :func:`build_matcher_strategy`.
Loads the DISK engine ONCE at composition time; subsequent
``match`` calls re-use the resolved handle. The composition root
supplies ``clock`` and ``fdr_client``; both are optional in the
factory signature so unit-tests that drive the matcher
end-to-end with a fake runtime can opt out of FDR emission.
"""
block = config.components["c3_matcher"]
weights_path = block.disk_weights_path
if weights_path is None:
raise ValueError(
"DiskLightGlueMatcher.create: config.components['c3_matcher']"
".disk_weights_path is None; the runtime root MUST populate the "
"DISK engine path before constructing this strategy."
)
if clock is None:
from gps_denied_onboard.clock.wall_clock import WallClock
clock = WallClock()
if logger is None:
logger = logging.getLogger("gps_denied_onboard.c3_matcher.disk_lightglue")
cache_entry = inference_runtime.compile_engine(
weights_path, _build_disk_build_config()
)
entry_for_deserialize = type(cache_entry)(
engine_path=cache_entry.engine_path,
sha256_hex=cache_entry.sha256_hex,
sm=cache_entry.sm,
jp=cache_entry.jp,
trt=cache_entry.trt,
precision=cache_entry.precision,
extras={**cache_entry.extras, "model_name": MODEL_NAME},
)
engine_handle = inference_runtime.deserialize_engine(entry_for_deserialize)
logger.info(
_LOG_KIND_READY,
extra={
"kind": _LOG_KIND_READY,
"kv": {
"strategy": _MATCHER_LABEL,
"min_inliers_threshold": int(block.min_inliers_threshold),
"residual_warn_threshold_px": float(
block.residual_warn_threshold_px
),
"ransac_threshold_px": float(block.ransac_threshold_px),
},
},
)
return DiskLightGlueMatcher(
config,
lightglue_runtime=lightglue_runtime,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
health_window=health_window,
engine_handle=engine_handle,
clock=clock,
fdr_client=fdr_client,
logger=logger,
)
@@ -0,0 +1,76 @@
"""C3's structural cut of C7 ``InferenceRuntime`` (AZ-507).
Concrete C3 :class:`CrossDomainMatcher` impls call into C7's inference
runtime to load engine handles (DISK / ALIKED / XFeat backbones) and
run forward passes. Per AZ-507, ``c3_matcher`` MUST NOT import
``components.c7_inference`` directly; the consumer-side cut declares
the structural Protocol surface that c3 actually uses, and the
composition root binds the c7 runtime as the concrete implementation.
This Protocol mirrors the subset of
:class:`gps_denied_onboard.components.c7_inference.InferenceRuntime`
that the C3 strategies consume — ``compile_engine``,
``deserialize_engine``, ``infer``, ``release_engine``, and
``current_runtime_label``. The full Protocol (which adds
``thermal_state``) is wider; the cut narrows to what C3 needs so
``isinstance(runtime, InferenceRuntimeCut)`` can be enforced without
demanding the wider surface.
Mirrors :mod:`gps_denied_onboard.components.c2_vpr.inference_runtime_cut`
1:1 — both components consume the same five-method surface of C7. The
duplication is deliberate: each component owns the structural cut it
needs (single-responsibility), so a future divergence in one consumer
does not silently widen the other's contract.
DTOs (``BuildConfig``, ``EngineHandle``, ``EngineCacheEntry``) live in
:mod:`gps_denied_onboard._types.inference` (L1) and are imported here
directly — they are L1 shared types, not cross-component imports.
Tile-pixel handles consumed via :class:`RerankCandidate.tile_pixels_handle`
are intentionally typed ``object`` at the rerank L1 boundary (the C6
TileStore owns the concrete type). C3 strategies treat them as
context managers yielding a JPEG memoryview (``with handle as
jpeg_view: ...``). No Protocol class is exported for this: the surface
is two methods (``__enter__`` / ``__exit__``) and ``object`` plus a
documented duck-type is the lightest contract that preserves the
architectural import boundary.
"""
from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING, Literal, Protocol, runtime_checkable
from gps_denied_onboard._types.inference import (
BuildConfig,
EngineCacheEntry,
EngineHandle,
)
if TYPE_CHECKING:
import numpy as np
__all__ = ["InferenceRuntimeCut"]
@runtime_checkable
class InferenceRuntimeCut(Protocol):
"""Subset of C7 ``InferenceRuntime`` consumed by C3 strategies."""
def compile_engine(
self, model_path: Path, build_config: BuildConfig
) -> EngineCacheEntry: ...
def deserialize_engine(self, entry: EngineCacheEntry) -> EngineHandle: ...
def infer(
self,
handle: EngineHandle,
inputs: dict[str, "np.ndarray"],
) -> dict[str, "np.ndarray"]: ...
def release_engine(self, handle: EngineHandle) -> None: ...
def current_runtime_label(
self,
) -> Literal["tensorrt", "onnx_trt_ep", "pytorch_fp16"]: ...
@@ -0,0 +1,544 @@
"""``XFeatMatcher`` — C3 lightweight alternate CrossDomainMatcher (AZ-347).
XFeat fuses feature extraction and matching into a single forward
pass — no separate LightGlue stage. The strategy is the lightweight
fallback for thermal-throttled scenarios and the mandated simple-
baseline at the matcher level (AC-2.1a engine rule applied
analogously to NetVLAD at C2).
Per AZ-347 Constraint, ``LightGlueRuntime`` is accepted in the
factory signature for uniformity with the DISK/ALIKED factories but
the strategy NEITHER stores nor calls it (AC-special-1). The factory
parameter exists so the AZ-344 :func:`build_matcher_strategy` can
invoke all three concrete ``create(...)`` functions with the same
args.
The XFeat preprocessor lives next to the strategy with hard-coded
parameters (Constraint — weights-coupled, same rule as DISK/ALIKED).
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Final, Literal
import cv2
import numpy as np
from gps_denied_onboard._types.inference import (
BuildConfig,
PrecisionMode,
)
from gps_denied_onboard._types.matcher import CandidateMatchSet, MatchResult
from gps_denied_onboard.components.c3_matcher._pipeline import (
TileExtractError,
_emit_backbone_error,
_emit_frame_done,
_emit_insufficient_inliers,
_fail_all,
decode_bgr_image,
)
from gps_denied_onboard.components.c3_matcher.errors import InsufficientInliersError
from gps_denied_onboard.components.c3_matcher.inference_runtime_cut import (
InferenceRuntimeCut,
)
from gps_denied_onboard.helpers.ransac_filter import RansacFilterError
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matcher import MatcherHealth
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c3_matcher._health_window import (
RollingHealthWindow,
)
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
__all__ = ["MODEL_NAME", "XFeatMatcher", "create"]
MODEL_NAME: Final[str] = "xfeat"
_MATCHER_LABEL: Final[Literal["xfeat"]] = "xfeat"
_FDR_PRODUCER_ID: Final[str] = "c3_matcher.xfeat"
_LOG_KIND_READY: Final[str] = "c3.matcher.ready"
_LOG_KIND_ZERO_INLIERS: Final[str] = "c3.matcher.zero_inliers"
# XFeat input contract: NCHW float32 grayscale at 384×384, normalised
# to [0, 1]. Hard-coded per Constraint — weights-coupled.
_INPUT_SIZE: Final[int] = 384
_QUERY_INPUT_KEY: Final[str] = "query"
_TILE_INPUT_KEY: Final[str] = "tile"
_OUTPUT_CORRESPONDENCES: Final[str] = "correspondences"
class XFeatMatcher:
"""XFeat single-pass cross-domain matcher.
Stateless per-frame except for the rolling health window. No
``LightGlueRuntime`` reference — see ``AC-special-1``.
"""
def __init__(
self,
config: "Config",
*,
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
engine_handle: object,
clock: "Clock",
fdr_client: "FdrClient | None",
logger: logging.Logger,
) -> None:
block = config.components["c3_matcher"]
self._config = config
self._ransac_filter = ransac_filter
self._inference_runtime = inference_runtime
self._engine_handle = engine_handle
self._health_window = health_window
self._clock = clock
self._fdr_client = fdr_client
self._logger = logger
self._min_inliers_threshold: int = int(block.min_inliers_threshold)
self._residual_warn_threshold_px: float = float(
block.residual_warn_threshold_px
)
self._ransac_threshold_px: float = float(block.ransac_threshold_px)
# AC-11: NO LightGlueRuntime reference. We intentionally do
# NOT define ``self._lightglue_runtime`` here so the test can
# assert ``not hasattr(strategy, "_lightglue_runtime")``.
def match(
self,
frame: "NavCameraFrame",
rerank_result: "RerankResult",
calibration: "CameraCalibration",
) -> "MatchResult":
del calibration
candidates_input = len(rerank_result.candidates)
frame_id = int(rerank_result.frame_id)
survivors: list[CandidateMatchSet] = []
dropped = 0
query_image = decode_bgr_image(frame.image)
if query_image is None:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=None,
phase="query_extract",
error_type="ImageDecodeError",
error_message="NavCameraFrame.image could not be decoded",
)
_fail_all(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
health_window=self._health_window,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
now_ns=int(self._clock.monotonic_ns()),
)
try:
query_tensor = _preprocess_xfeat(query_image)
except TileExtractError as exc:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=None,
phase="query_extract",
error_type=type(exc).__name__,
error_message=str(exc),
)
_fail_all(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
health_window=self._health_window,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
now_ns=int(self._clock.monotonic_ns()),
)
for candidate in rerank_result.candidates:
survivor = self._process_candidate(
candidate=candidate,
query_tensor=query_tensor,
frame_id=frame_id,
)
if survivor is None:
dropped += 1
continue
survivors.append(survivor)
now_ns = int(self._clock.monotonic_ns())
had_backbone_error = dropped > 0
if not survivors:
_fail_all(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
health_window=self._health_window,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
now_ns=now_ns,
had_backbone_error=had_backbone_error,
)
survivors.sort(
key=lambda s: (-s.inlier_count, s.per_candidate_residual_px)
)
max_inliers = survivors[0].inlier_count
if max_inliers < self._min_inliers_threshold:
_emit_insufficient_inliers(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
max_inlier_count=max_inliers,
)
self._health_window.update(
timestamp_ns=now_ns,
best_inlier_count=0,
had_backbone_error=had_backbone_error,
)
raise InsufficientInliersError(
f"{_MATCHER_LABEL}.match: every survivor's inlier_count is "
f"below min_inliers_threshold={self._min_inliers_threshold} "
f"(max_inlier_count={max_inliers}, frame_id={frame_id})"
)
best = survivors[0]
if best.per_candidate_residual_px > self._residual_warn_threshold_px:
self._logger.warning(
"c3.matcher.residual_above_threshold",
extra={
"kind": "c3.matcher.residual_above_threshold",
"kv": {
"frame_id": frame_id,
"matcher_label": _MATCHER_LABEL,
"residual_px": float(best.per_candidate_residual_px),
"threshold_px": float(self._residual_warn_threshold_px),
},
},
)
self._health_window.update(
timestamp_ns=now_ns,
best_inlier_count=int(best.inlier_count),
had_backbone_error=had_backbone_error,
)
result = MatchResult(
frame_id=frame_id,
per_candidate=tuple(survivors),
best_candidate_idx=0,
reprojection_residual_px=float(best.per_candidate_residual_px),
matched_at=now_ns,
matcher_label=_MATCHER_LABEL,
candidates_input=candidates_input,
candidates_dropped=dropped,
)
_emit_frame_done(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
result=result,
)
return result
def health_snapshot(self) -> "MatcherHealth":
return self._health_window.snapshot()
def _process_candidate(
self,
*,
candidate: "RerankCandidate",
query_tensor: np.ndarray,
frame_id: int,
) -> CandidateMatchSet | None:
tile_id = candidate.tile_id
handle = candidate.tile_pixels_handle
try:
with handle as jpeg_view:
tile_image = _decode_tile_jpeg(jpeg_view)
except (ValueError, AttributeError, TypeError) as exc:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=tile_id,
phase="tile_decode",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
try:
tile_tensor = _preprocess_xfeat(tile_image)
except TileExtractError as exc:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=tile_id,
phase="tile_extract",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
try:
outputs = self._inference_runtime.infer(
self._engine_handle,
{_QUERY_INPUT_KEY: query_tensor, _TILE_INPUT_KEY: tile_tensor},
)
except Exception as exc:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=tile_id,
phase="xfeat_forward",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
if _OUTPUT_CORRESPONDENCES not in outputs:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=tile_id,
phase="xfeat_forward",
error_type="KeyError",
error_message=(
f"XFeat forward missing {_OUTPUT_CORRESPONDENCES!r}; "
f"got keys={sorted(outputs.keys())!r}"
),
)
return None
raw = np.asarray(outputs[_OUTPUT_CORRESPONDENCES], dtype=np.float32)
if raw.ndim != 2 or raw.shape[1] != 4 or raw.shape[0] < 4:
self._logger.debug(
_LOG_KIND_ZERO_INLIERS,
extra={
"kind": _LOG_KIND_ZERO_INLIERS,
"kv": {
"frame_id": frame_id,
"matcher_label": _MATCHER_LABEL,
"tile_id": list(tile_id),
"correspondences": int(raw.shape[0]) if raw.ndim == 2 else 0,
},
},
)
return None
try:
ransac_result = self._ransac_filter.filter_correspondences(
raw,
self._ransac_threshold_px,
self._min_inliers_threshold,
)
except RansacFilterError as exc:
_emit_backbone_error(
logger=self._logger,
fdr_client=self._fdr_client,
fdr_producer_id=_FDR_PRODUCER_ID,
clock=self._clock,
matcher_label=_MATCHER_LABEL,
frame_id=frame_id,
tile_id=tile_id,
phase="ransac",
error_type=type(exc).__name__,
error_message=str(exc),
)
return None
if ransac_result.inlier_count == 0:
self._logger.debug(
_LOG_KIND_ZERO_INLIERS,
extra={
"kind": _LOG_KIND_ZERO_INLIERS,
"kv": {
"frame_id": frame_id,
"matcher_label": _MATCHER_LABEL,
"tile_id": list(tile_id),
},
},
)
return None
inliers = np.ascontiguousarray(
ransac_result.inlier_correspondences, dtype=np.float32
)
return CandidateMatchSet(
tile_id=tile_id,
inlier_count=int(ransac_result.inlier_count),
inlier_correspondences=inliers,
ransac_outlier_count=int(ransac_result.outlier_count),
per_candidate_residual_px=float(ransac_result.median_residual_px),
)
def _preprocess_xfeat(image_bgr: np.ndarray) -> np.ndarray:
"""BGR → grayscale 384×384 float32 NCHW in ``[0, 1]``.
Hard-coded weights-coupled preprocessing.
"""
if image_bgr.ndim == 3:
gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
elif image_bgr.ndim == 2:
gray = image_bgr
else:
raise TileExtractError(
f"XFeat preprocessor: image must be 2-D (gray) or 3-D (BGR); "
f"got ndim={image_bgr.ndim} shape={image_bgr.shape}"
)
if gray.shape[0] != _INPUT_SIZE or gray.shape[1] != _INPUT_SIZE:
gray = cv2.resize(
gray, (_INPUT_SIZE, _INPUT_SIZE), interpolation=cv2.INTER_AREA
)
if gray.dtype != np.float32:
gray = gray.astype(np.float32) / 255.0
else:
gray = gray / 255.0
return gray.reshape(1, 1, _INPUT_SIZE, _INPUT_SIZE).astype(np.float32, copy=False)
def _decode_tile_jpeg(jpeg_view: memoryview) -> np.ndarray:
data = bytes(jpeg_view)
if not data:
raise ValueError("empty JPEG buffer")
buf = np.frombuffer(data, dtype=np.uint8)
decoded = cv2.imdecode(buf, cv2.IMREAD_COLOR)
if decoded is None:
raise ValueError("cv2.imdecode returned None for tile JPEG")
return decoded
def _build_xfeat_build_config() -> BuildConfig:
return BuildConfig(
precision=PrecisionMode.FP16,
workspace_mb=256,
calibration_dataset=None,
optimization_profiles=(),
)
def create(
config: "Config",
*,
lightglue_runtime: "LightGlueRuntime",
ransac_filter: type["RansacFilter"],
inference_runtime: InferenceRuntimeCut,
health_window: "RollingHealthWindow",
clock: "Clock | None" = None,
fdr_client: "FdrClient | None" = None,
logger: logging.Logger | None = None,
) -> "XFeatMatcher":
"""Module-level factory consumed by :func:`build_matcher_strategy`.
``lightglue_runtime`` is accepted for factory uniformity (Constraint
— AZ-347) and discarded. XFeat is single-pass and does not consume
the LightGlue helper.
"""
del lightglue_runtime # AC-special-1: accepted for uniformity, never stored/used
block = config.components["c3_matcher"]
weights_path = block.xfeat_weights_path
if weights_path is None:
raise ValueError(
"XFeatMatcher.create: config.components['c3_matcher']"
".xfeat_weights_path is None; the runtime root MUST populate the "
"XFeat engine path before constructing this strategy."
)
if clock is None:
from gps_denied_onboard.clock.wall_clock import WallClock
clock = WallClock()
if logger is None:
logger = logging.getLogger("gps_denied_onboard.c3_matcher.xfeat")
cache_entry = inference_runtime.compile_engine(
weights_path, _build_xfeat_build_config()
)
entry_for_deserialize = type(cache_entry)(
engine_path=cache_entry.engine_path,
sha256_hex=cache_entry.sha256_hex,
sm=cache_entry.sm,
jp=cache_entry.jp,
trt=cache_entry.trt,
precision=cache_entry.precision,
extras={**cache_entry.extras, "model_name": MODEL_NAME},
)
engine_handle = inference_runtime.deserialize_engine(entry_for_deserialize)
logger.info(
_LOG_KIND_READY,
extra={
"kind": _LOG_KIND_READY,
"kv": {
"strategy": _MATCHER_LABEL,
"min_inliers_threshold": int(block.min_inliers_threshold),
"residual_warn_threshold_px": float(
block.residual_warn_threshold_px
),
"ransac_threshold_px": float(block.ransac_threshold_px),
},
},
)
return XFeatMatcher(
config,
ransac_filter=ransac_filter,
inference_runtime=inference_runtime,
health_window=health_window,
engine_handle=engine_handle,
clock=clock,
fdr_client=fdr_client,
logger=logger,
)