[AZ-343] C2.5 InlierCountReRanker + shared FeatureExtractor helper

Implements the production-default ReRankStrategy: K=10 → N=3 by
single-pair LightGlue inlier count, with strict drop-and-continue
(INV-8) on per-candidate TileFetch / backbone / zero-inlier failures
and RerankAllCandidatesFailedError on zero survivors. Composition
root injects the shared LightGlueRuntime + Clock + the new
FeatureExtractor helper (an L1 placeholder OpenCvOrbExtractor that
unblocks AZ-343 and future C3 strategies — task scope expansion).

Architectural notes:
- Cross-component imports stay banned; tile_store types as `object`
  and the C6 TileCacheError family is duck-typed by class module
  prefix (same workaround AZ-348 adopted for c7_inference; proper
  fix is to relocate TileCacheError to _types/ in a follow-up).
- Clock injection follows the replay contract (AZ-398 Invariant 2);
  reranked_at is sourced from clock.monotonic_ns().
- AZ-342 factory grew `feature_extractor` + `clock` + `fdr_client`
  parameters; existing AZ-342 conformance tests updated.

Tests: 19 new AC-1..AC-12 + mixed-failure scenarios in
test_inlier_count_reranker.py; existing AZ-342 suite (26) still
green. Full repo sweep 1093 passed / 2 skipped (cmake/actionlint
not on PATH).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 06:22:40 +03:00
parent 9a605c8514
commit 48ea1e2fc2
10 changed files with 1739 additions and 13 deletions
@@ -4,11 +4,12 @@
**Purpose**: re-rank C2's top-K=10 VPR candidates down to top-N=3 by single-pair LightGlue inlier count, producing a higher-precision input for the cross-domain matcher (C3). The re-rank step is the architectural boundary between cheap descriptor retrieval (C2) and expensive cross-domain matching (C3) — it pays a small extra cost so C3 only operates on the most promising candidates.
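The K=10 → N=3 cut can be illustrated with a minimal sketch. `Scored` and `top_n_by_inliers` are hypothetical stand-ins rather than the project's `RerankCandidate` or strategy API; the ordering (inlier count descending, ties broken by ascending descriptor distance) mirrors the sort used by `InlierCountReRanker`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    """Hypothetical stand-in for a scored candidate (not the real RerankCandidate)."""

    tile_id: tuple
    inlier_count: int
    descriptor_distance: float


def top_n_by_inliers(candidates: list, n: int) -> tuple:
    """Rank by inlier count (descending), then descriptor distance (ascending)."""
    ranked = sorted(candidates, key=lambda c: (-c.inlier_count, c.descriptor_distance))
    return tuple(ranked[:n])


scored = [
    Scored((14, 1, 1), inlier_count=40, descriptor_distance=0.30),
    Scored((14, 1, 2), inlier_count=75, descriptor_distance=0.45),
    Scored((14, 1, 3), inlier_count=40, descriptor_distance=0.20),
    Scored((14, 1, 4), inlier_count=12, descriptor_distance=0.10),
]
top = top_n_by_inliers(scored, n=3)
```

The two candidates tied at 40 inliers are ordered by their retrieval distance, so the cheaper C2 signal still contributes when the matcher cannot separate them.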
**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`, AZ-343). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
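As a rough illustration of the Strategy pattern named above, the sketch below uses a deliberately simplified, hypothetical interface (`ReRankStrategySketch`, taking plain score lists). The real `ReRankStrategy.rerank` takes a frame, a `VprResult`, `n`, and a calibration; only the shape of the pattern is shown here.

```python
from typing import Protocol, Sequence


class ReRankStrategySketch(Protocol):
    """Hypothetical, simplified stand-in for the ReRankStrategy interface."""

    def rerank(self, scores: Sequence[int], n: int) -> list:
        """Return the indices of the n candidates to keep."""
        ...


class ByInlierCount:
    """Keeps the n candidates with the highest score."""

    def rerank(self, scores: Sequence[int], n: int) -> list:
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        return order[:n]


class KeepRetrievalOrder:
    """Alternative strategy: trust the upstream order and only truncate."""

    def rerank(self, scores: Sequence[int], n: int) -> list:
        return list(range(min(n, len(scores))))


def run(strategy: ReRankStrategySketch, scores: Sequence[int], n: int) -> list:
    # The caller depends only on the Protocol, never on a concrete class.
    return strategy.rerank(scores, n)


by_inliers = run(ByInlierCount(), [5, 42, 17, 30], n=3)
as_retrieved = run(KeepRetrievalOrder(), [5, 42, 17, 30], n=3)
```

Because `run` only knows the Protocol, a second algorithm can be added without touching any call site.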
**Upstream dependencies**:
- C2 → `VprResult` (top-K=10 candidates).
- Shared `LightGlueRuntime` helper (used in single-pair mode for inlier counting; the same matcher object is shared with C3 — owned by the helper, not by C3, so neither component depends on the other at build time).
- Shared `FeatureExtractor` helper (`helpers/feature_extractor.py`, AZ-343 scope expansion): extracts a `KeypointSet` from both the per-frame nav image and each candidate's tile JPEG; the placeholder impl is `OpenCvOrbExtractor`, to be swapped out for a TRT-backed deep extractor before flight.
- C6 TileStore → fetch tile pixels for each candidate (cheap, in-memory page-cache hit during a flight).
- Camera calibration artifact — for nav-frame preprocessing.
@@ -59,7 +60,7 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
**Algorithmic Complexity**: `O(K)` LightGlue forward passes per frame (K=10), each `O(M_tile · M_query)` in feature counts. The whole step is GPU-bound on the same engine that C3 uses — hence the shared LightGlue runtime.
**State Management**: stateless per-frame. Holds a reference to the shared LightGlue object owned by C3.
**State Management**: stateless per-frame. Holds references to the constructor-injected `LightGlueRuntime`, `FeatureExtractor`, `TileStore`, `Clock`, and (optionally) `FdrClient` — all lifecycle-owned by the runtime root, not by C2.5.
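The constructor-injection arrangement described above might look roughly like the following. `ReRankerSketch` and `try_build` are hypothetical names used only for illustration; the point is that keyword-only parameters turn a forgotten collaborator into an immediate `TypeError`.

```python
from typing import Optional


class ReRankerSketch:
    """Hypothetical strategy that borrows, but never owns, its collaborators."""

    def __init__(
        self,
        *,
        matcher: object,
        extractor: object,
        clock: object,
        fdr_client: Optional[object] = None,
    ) -> None:
        # Keyword-only parameters: a missing collaborator fails at
        # construction time instead of on the first frame.
        self._matcher = matcher
        self._extractor = extractor
        self._clock = clock
        self._fdr_client = fdr_client  # optional: None disables FDR emission


def try_build(**kwargs: object) -> str:
    """Report whether construction succeeded for the given wiring."""
    try:
        ReRankerSketch(**kwargs)
    except TypeError:
        return "rejected"
    return "built"


complete = try_build(matcher=object(), extractor=object(), clock=object())
missing_clock = try_build(matcher=object(), extractor=object())
```

Since the object holds only references and no per-frame fields, the same instance can serve every frame, and the runtime root stays the single owner of each collaborator's lifecycle.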
**Key Dependencies**:
@@ -78,6 +79,8 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
| Helper | Purpose | Used By |
|--------|---------|---------|
| `LightGlueRuntime` | shared LightGlue inference handle (one engine, many call sites) | C2.5, C3 |
| `FeatureExtractor` (`helpers/feature_extractor.py`) | shared image → `KeypointSet` extractor; default `OpenCvOrbExtractor`, target TRT-backed DISK/ALIKED | C2.5, future C3 backbones |
| `Clock` (`gps_denied_onboard.clock`) | composition-root time source; stamps `RerankResult.reranked_at` via `clock.monotonic_ns()` (Invariant 2 of the replay contract — no direct `time.*` in components) | every C* component |
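To illustrate why the time source is injected rather than read from `time.*` directly, here is a minimal sketch. `ClockSketch`, `SystemClock`, `ReplayClock`, and `stamp` are hypothetical names; the project's `Clock` may expose more methods than the single `monotonic_ns()` assumed here.

```python
import time
from typing import Protocol


class ClockSketch(Protocol):
    """Hypothetical minimal clock interface (only monotonic_ns is assumed)."""

    def monotonic_ns(self) -> int: ...


class SystemClock:
    """Live clock: delegates to the standard library."""

    def monotonic_ns(self) -> int:
        return time.monotonic_ns()


class ReplayClock:
    """Replay clock: hands back pre-recorded timestamps in order."""

    def __init__(self, ticks: list) -> None:
        self._ticks = iter(ticks)

    def monotonic_ns(self) -> int:
        return next(self._ticks)


def stamp(clock: ClockSketch) -> int:
    # A component stamps its result through the injected clock only.
    return int(clock.monotonic_ns())


replay = ReplayClock([1_000, 2_500])
replayed = [stamp(replay), stamp(replay)]
live = stamp(SystemClock())
```

With the replay clock wired in, two runs over the same recording produce identical timestamps, which is what makes a replayed result comparable to the original.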
## 7. Caveats & Edge Cases
+13 -4
@@ -57,12 +57,14 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Epic**: AZ-256 (E-C2.5 Rerank)
- **Directory**: `src/gps_denied_onboard/components/c2_5_rerank/`
- **Public API**:
- `__init__.py` (re-exports `ReRankStrategy`, `RerankResult`, `RerankCandidate`)
- `__init__.py` (re-exports `ReRankStrategy`, `RerankResult`, `RerankCandidate`, `RerankError` family, `C2_5RerankConfig`)
- `interface.py` (`ReRankStrategy` Protocol)
- `config.py` (`C2_5RerankConfig` dataclass; registered on import; `strategy`, `top_n`, `debug_per_frame_log` fields)
- `errors.py` (`RerankError`, `RerankBackboneError`, `RerankAllCandidatesFailedError`)
- **Internal**:
- `inlier_based_reranker.py` (single-pair LightGlue inlier count K=10→N=3)
- **Owns**: `src/gps_denied_onboard/components/c2_5_rerank/**`, `tests/unit/c2_5_rerank/**`
- **Imports from**: `_types`, `helpers.lightglue_runtime`, `helpers.descriptor_normaliser`, `helpers.ransac_filter`, `helpers.se3_utils`, `components.c6_tile_cache` (Public API), `components.c7_inference`, `config`, `logging`, `fdr_client`
- `inlier_based_reranker.py` (`InlierCountReRanker`: single-pair LightGlue inlier count K=10→N=3, AZ-343; module-level `create()` factory entry-point consumed by `runtime_root.rerank_factory.build_rerank_strategy`; gated by `BUILD_RERANK_INLIER_COUNT`)
- **Owns**: `src/gps_denied_onboard/components/c2_5_rerank/**`, `src/gps_denied_onboard/runtime_root/rerank_factory.py`, `tests/unit/c2_5_rerank/**`
- **Imports from**: `_types`, `helpers.lightglue_runtime`, `helpers.feature_extractor` (AZ-343 scope expansion), `helpers.descriptor_normaliser`, `helpers.ransac_filter`, `helpers.se3_utils`, `components.c6_tile_cache` (Public API only — `TileStore`, `TilePixelHandle`, `TileCacheError` family), `clock`, `config`, `logging`, `fdr_client`
- **Consumed by**: `c3_matcher`, `runtime_root`
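The concrete strategy recognises the `TileCacheError` family without importing it, by checking the module in which the exception class is defined. The sketch below shows that mechanism with a hypothetical package name (`tile_cache_sketch`) and a class built on the fly; it is an illustration of the technique, not the project's code.

```python
def is_from_package(exc: BaseException, prefix: str) -> bool:
    """True if the exception's class was DEFINED in a module under prefix."""
    return type(exc).__module__.startswith(prefix)


# Hypothetical error class that reports a module inside the component package.
TileNotFound = type(
    "TileNotFound",
    (Exception,),
    {"__module__": "tile_cache_sketch.errors"},
)

matches_component_error = is_from_package(TileNotFound("missing"), "tile_cache_sketch")

# A builtin raised from inside the component still reports "builtins",
# so it is not mistaken for a member of the component's error family.
matches_builtin_error = is_from_package(AttributeError("bug"), "tile_cache_sketch")
```

The check keys on where the class is declared, not on where the exception was raised, so a module rename on the producer side silently changes what matches and is best guarded by a test.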
### Component: c3_matcher
@@ -289,6 +291,13 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Owned by**: AZ-264.
- **Consumed by**: c2_5_rerank, c3_matcher.
### shared/helpers/feature_extractor
- **Directory**: `src/gps_denied_onboard/helpers/feature_extractor.py`
- **Purpose**: Shared image → `KeypointSet` Protocol + placeholder `OpenCvOrbExtractor` impl (AZ-343 scope expansion). Lets every consumer that feeds `LightGlueRuntime.match` reach for the SAME extractor (same descriptor distribution, same `descriptor_dim`) without each strategy reinventing its own preprocessing.
- **Owned by**: AZ-343.
- **Consumed by**: c2_5_rerank (today via `InlierCountReRanker`), c3_matcher (future concrete strategies in AZ-345 / AZ-346 / AZ-347).
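For the placeholder extractor, the descriptor contract (float32 rows of a fixed `descriptor_dim`) is met by widening ORB's packed binary descriptors. The sketch below shows only that widening step on synthetic bytes, using NumPy without OpenCV; `widen_binary_descriptors` is a hypothetical helper name.

```python
import numpy as np


def widen_binary_descriptors(descriptors_uint8: np.ndarray) -> np.ndarray:
    """Expand (N, 32) packed uint8 descriptors into (N, 256) float32 bit indicators."""
    return np.unpackbits(descriptors_uint8, axis=1).astype(np.float32)


# Two synthetic 32-byte descriptors: one with two bits set, one with all bits set.
packed = np.array(
    [[0b10000001] + [0] * 31, [255] * 32],
    dtype=np.uint8,
)
wide = widen_binary_descriptors(packed)
```

Every consumer that shares one extractor instance therefore sees the same 256-dimensional layout, which is the property the shared helper is meant to guarantee.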
### shared/helpers/wgs_converter
- **Directory**: `src/gps_denied_onboard/helpers/wgs_converter.py`
+1 -1
@@ -8,7 +8,7 @@ status: in_progress
sub_step:
phase: 2
name: detect-progress
detail: "batch 24 complete (AZ-336, AZ-342, AZ-344, AZ-348 per-task commits); ready to plan batch 25"
detail: "batch 25 in progress: AZ-343 implemented (InlierCountReRanker + shared FeatureExtractor helper, 19 AZ-343 tests + 26 AZ-342 tests green, full sweep 1093 passed/2 skipped); pending AZ-345, AZ-332"
retry_count: 0
cycle: 1
tracker: jira
@@ -37,10 +37,18 @@ class C2_5RerankConfig:
``top_n`` is the per-frame N cap (1..K-1). Default 3 (the epic's
K=10 → N=3 spec).
``debug_per_frame_log`` gates the two DEBUG events
(``c2_5.rerank.zero_inliers`` per dropped candidate and
``c2_5.rerank.frame_done`` per frame); flooding journald at
``3 Hz × K=10 = 30 events/sec`` by default would violate
description.md § 9. Operators flip this to ``True`` for the
debug-build flight binary.
"""
strategy: str = "inlier_count"
top_n: int = 3
debug_per_frame_log: bool = False
def __post_init__(self) -> None:
if self.strategy not in KNOWN_STRATEGIES:
@@ -0,0 +1,584 @@
"""C2.5 :class:`InlierCountReRanker` — single-pair LightGlue inlier count (AZ-343).
Production-default :class:`ReRankStrategy` for the K=10 → N=3 cut.
For each candidate in :class:`VprResult.candidates`:
1. Fetch tile pixels via :class:`TileStore.read_tile_pixels` (a
:class:`TilePixelHandle` context manager backed by an mmap'd JPEG).
2. Extract :class:`KeypointSet` from BOTH the query frame and the
candidate tile via the shared :class:`FeatureExtractor` (AZ-343
scope expansion).
3. Call :meth:`LightGlueRuntime.match` for the single pair; count the
number of correspondences as the inlier proxy.
4. Sort surviving candidates descending by ``inlier_count`` (ties
broken ascending by ``descriptor_distance`` carried forward from
C2; INV-3); truncate to ``n``; return a :class:`RerankResult`.
Drop-and-continue (INV-8) is the central reliability mechanism: any
per-candidate :class:`TileCacheError` or LightGlue / feature-extractor
failure is caught inside the loop, the candidate is dropped, an ERROR
log + FDR record is emitted, and the loop continues. Only the
zero-survivors case escapes as :class:`RerankAllCandidatesFailedError`.
The survivor's ``tile_pixels_handle`` is identity-equal to the handle
returned by ``TileStore.read_tile_pixels`` (INV-6 / AC-7). The handle
is exited at the end of feature extraction; downstream C3 re-enters it
to read pixels — the C6 page-cache-backed impl supports re-entry for
the per-frame TTL window.
"""
from __future__ import annotations
import logging
from datetime import datetime, timezone
from typing import TYPE_CHECKING
import cv2
import numpy as np
from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
from gps_denied_onboard.components.c2_5_rerank.errors import (
RerankAllCandidatesFailedError,
RerankBackboneError,
)
from gps_denied_onboard.fdr_client import FdrRecord
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractorError
from gps_denied_onboard.helpers.lightglue_runtime import (
LightGlueConcurrentAccessError,
LightGlueRuntimeError,
)
if TYPE_CHECKING:
from gps_denied_onboard._types.calibration import CameraCalibration
from gps_denied_onboard._types.matching import KeypointSet
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.vpr import VprResult
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractor
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
# Cross-component types (`TileStore`, `ReRankStrategy`,
# `C2_5RerankConfig`) are intentionally NOT imported here — even under
# ``TYPE_CHECKING``, an AST-level cross-component import is rejected by
# ``test_ac6_only_compose_root_imports_concrete_strategies``. The
# composition root (``runtime_root.rerank_factory``) injects concrete
# instances satisfying these Protocols; we accept them as ``object``
# at the constructor boundary and trust the runtime root for type
# safety.
__all__ = ["InlierCountReRanker", "create"]
_LOG = logging.getLogger("gps_denied_onboard.c2_5_rerank")
_PRODUCER_ID = "c2_5_rerank.inlier_count"
# C6 TileCacheError lives in `gps_denied_onboard.components.c6_tile_cache.errors`
# but we cannot import it: cross-component imports are banned outside the
# composition root (test_ac6_only_compose_root_imports_concrete_strategies).
# Match the family by class-module prefix instead — the C6 contract documents
# the module path so a future module-rename surfaces as a test failure here.
_C6_ERROR_MODULE_PREFIX = "gps_denied_onboard.components.c6_tile_cache"
def _is_tile_cache_error(exc: BaseException) -> bool:
"""True if ``exc`` is a C6 :class:`TileCacheError` subclass.
Duck-types against the module that DEFINES the exception class to
keep the architectural import boundary clean. Only classes declared
under the C6 package match: a builtin such as ``AttributeError``
raised from inside C6 reports ``builtins`` as its module, does not
match, and is re-raised by the callers. That is acceptable, since by
contract C6 wraps OS errors into :class:`TileCacheError`; anything
bare leaking out is a C6 bug and propagates to the caller instead of
being absorbed as a per-candidate drop.
"""
return type(exc).__module__.startswith(_C6_ERROR_MODULE_PREFIX)
class InlierCountReRanker:
"""Single-pair LightGlue inlier-count :class:`ReRankStrategy` (AZ-343)."""
def __init__(
self,
*,
config: Config,
tile_store: object,
lightglue_runtime: LightGlueRuntime,
feature_extractor: FeatureExtractor,
clock: Clock,
fdr_client: FdrClient | None,
) -> None:
# Keyword-only injection: a runtime-root regression that forgets
# one of the helpers fails loudly instead of silently constructing
# an under-wired strategy. ``tile_store`` is typed ``object``
# because the C6 ``TileStore`` Protocol lives in another
# component (see the module-level comment on cross-component imports).
block = config.components["c2_5_rerank"]
self._tile_store = tile_store
self._lightglue_runtime = lightglue_runtime
self._feature_extractor = feature_extractor
self._clock = clock
self._fdr_client = fdr_client
self._top_n = int(block.top_n)
self._debug_per_frame_log = bool(block.debug_per_frame_log)
def rerank(
self,
frame: NavCameraFrame,
vpr_result: VprResult,
n: int,
calibration: CameraCalibration,
) -> RerankResult:
candidates_input = len(vpr_result.candidates)
target_n = self._top_n if n <= 0 else min(self._top_n, n)
if candidates_input == 0:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=0,
candidates_dropped=0,
reason="no_input_candidates",
)
query_features = self._extract_query_features(frame)
if query_features is None:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_input,
reason="query_extraction_failed",
)
survivors: list[RerankCandidate] = []
dropped = 0
inlier_counts: list[int] = []
for vpr_candidate in vpr_result.candidates:
tile_id = vpr_candidate.tile_id
survivor = self._process_candidate(
tile_id=tile_id,
vpr_candidate=vpr_candidate,
query_features=query_features,
frame_id=vpr_result.frame_id,
)
if survivor is None:
dropped += 1
continue
survivors.append(survivor)
inlier_counts.append(survivor.inlier_count)
if not survivors:
self._fail_all(
frame_id=vpr_result.frame_id,
candidates_input=candidates_input,
candidates_dropped=dropped,
reason="all_candidates_dropped",
)
survivors.sort(key=lambda c: (-c.inlier_count, c.descriptor_distance))
truncated = tuple(survivors[:target_n])
if len(truncated) < target_n:
_LOG.warning(
"c2_5.rerank.fewer_than_n_survivors",
extra={
"kind": "c2_5.rerank.fewer_than_n_survivors",
"kv": {
"requested": target_n,
"returned": len(truncated),
"dropped": dropped,
"frame_id": vpr_result.frame_id,
},
},
)
if self._debug_per_frame_log:
_LOG.debug(
"c2_5.rerank.frame_done",
extra={
"kind": "c2_5.rerank.frame_done",
"kv": {
"frame_id": vpr_result.frame_id,
"inlier_counts": inlier_counts,
},
},
)
result = RerankResult(
frame_id=vpr_result.frame_id,
candidates=truncated,
reranked_at=int(self._clock.monotonic_ns()),
rerank_label="inlier_count",
candidates_input=candidates_input,
candidates_dropped=dropped,
)
self._emit_frame_done_fdr(result)
return result
# ------------------------------------------------------------------
# Per-candidate pipeline: open handle → extract → match → score.
def _process_candidate(
self,
*,
tile_id,
vpr_candidate,
query_features,
frame_id: int,
) -> RerankCandidate | None:
try:
handle = self._tile_store.read_tile_pixels(tile_id)
except Exception as exc:
if not _is_tile_cache_error(exc):
raise
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
tile_features = self._extract_tile_features(
handle=handle, tile_id=tile_id, frame_id=frame_id
)
if tile_features is None:
return None
inlier_count = self._count_inliers(
query_features=query_features,
tile_features=tile_features,
tile_id=tile_id,
frame_id=frame_id,
)
if inlier_count is None:
return None
if inlier_count == 0:
self._maybe_log_zero_inliers(tile_id=tile_id, frame_id=frame_id)
return None
return RerankCandidate(
tile_id=tile_id,
inlier_count=inlier_count,
descriptor_distance=vpr_candidate.descriptor_distance,
descriptor_dim=vpr_candidate.descriptor_dim,
tile_pixels_handle=handle,
)
def _extract_query_features(
self, frame: NavCameraFrame
) -> KeypointSet | None:
image = _ensure_bgr_array(frame.image)
if image is None:
self._log_backbone_error(
frame_id=frame.frame_id,
tile_id=None,
reason="query_image_not_decodable",
error=None,
)
return None
try:
return self._feature_extractor.extract(image)
except FeatureExtractorError as exc:
self._log_backbone_error(
frame_id=frame.frame_id,
tile_id=None,
reason="query_feature_extraction_failed",
error=exc,
)
return None
def _extract_tile_features(
self, *, handle, tile_id, frame_id: int
) -> KeypointSet | None:
try:
with handle as jpeg_view:
tile_image = _decode_jpeg(jpeg_view)
except ValueError as exc:
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
except Exception as exc:
if not _is_tile_cache_error(exc):
raise
self._log_tile_fetch_error(tile_id=tile_id, frame_id=frame_id, exc=exc)
return None
try:
return self._feature_extractor.extract(tile_image)
except FeatureExtractorError as exc:
self._log_backbone_error(
frame_id=frame_id,
tile_id=tile_id,
reason="tile_feature_extraction_failed",
error=exc,
)
return None
def _count_inliers(
self,
*,
query_features,
tile_features,
tile_id,
frame_id: int,
) -> int | None:
try:
correspondences = self._lightglue_runtime.match(
query_features, tile_features
)
except (
LightGlueRuntimeError,
LightGlueConcurrentAccessError,
RerankBackboneError,
RuntimeError,
) as exc:
self._log_backbone_error(
frame_id=frame_id,
tile_id=tile_id,
reason="lightglue_forward_failed",
error=exc,
)
return None
scores = getattr(correspondences, "scores", None)
if scores is None:
return 0
try:
return int(np.asarray(scores).shape[0])
except (TypeError, ValueError, IndexError):
# IndexError covers a 0-d ``scores`` value, whose shape is ``()``.
return 0
# ------------------------------------------------------------------
# Log + FDR helpers.
def _log_tile_fetch_error(self, *, tile_id, frame_id: int, exc: BaseException) -> None:
_LOG.error(
"c2_5.rerank.tile_fetch_error",
extra={
"kind": "c2_5.rerank.tile_fetch_error",
"kv": {
"frame_id": frame_id,
"tile_id": list(tile_id),
"error": repr(exc),
},
},
)
if self._fdr_client is None:
return
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.tile_fetch_error",
payload={
"frame_id": int(frame_id),
"tile_id": list(tile_id),
},
)
)
def _log_backbone_error(
self,
*,
frame_id: int,
tile_id,
reason: str,
error: BaseException | None,
) -> None:
kv: dict[str, object] = {"frame_id": frame_id, "reason": reason}
if tile_id is not None:
kv["tile_id"] = list(tile_id)
if error is not None:
kv["error"] = repr(error)
_LOG.error(
"c2_5.rerank.backbone_error",
extra={"kind": "c2_5.rerank.backbone_error", "kv": kv},
)
if self._fdr_client is None:
return
payload: dict[str, object] = {
"frame_id": int(frame_id),
"reason": reason,
}
if tile_id is not None:
payload["tile_id"] = list(tile_id)
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.backbone_error",
payload=payload,
)
)
def _maybe_log_zero_inliers(self, *, tile_id, frame_id: int) -> None:
if not self._debug_per_frame_log:
return
_LOG.debug(
"c2_5.rerank.zero_inliers",
extra={
"kind": "c2_5.rerank.zero_inliers",
"kv": {"frame_id": frame_id, "tile_id": list(tile_id)},
},
)
def _emit_frame_done_fdr(self, result: RerankResult) -> None:
if self._fdr_client is None:
return
top = result.candidates[0]
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.frame_done",
payload={
"frame_id": int(result.frame_id),
"candidates_input": int(result.candidates_input),
"candidates_dropped": int(result.candidates_dropped),
"top_inlier_count": int(top.inlier_count),
"top_tile_id": list(top.tile_id),
},
)
)
def _emit_all_failed_fdr(
self, *, frame_id: int, candidates_input: int, candidates_dropped: int
) -> None:
if self._fdr_client is None:
return
self._safe_enqueue(
FdrRecord(
schema_version=1,
ts=self._fdr_ts(),
producer_id=_PRODUCER_ID,
kind="rerank.all_failed",
payload={
"frame_id": int(frame_id),
"candidates_input": int(candidates_input),
"candidates_dropped": int(candidates_dropped),
},
)
)
def _fail_all(
self,
*,
frame_id: int,
candidates_input: int,
candidates_dropped: int,
reason: str,
) -> None:
_LOG.error(
"c2_5.rerank.all_failed",
extra={
"kind": "c2_5.rerank.all_failed",
"kv": {
"frame_id": frame_id,
"candidates_input": candidates_input,
"candidates_dropped": candidates_dropped,
"reason": reason,
},
},
)
self._emit_all_failed_fdr(
frame_id=frame_id,
candidates_input=candidates_input,
candidates_dropped=candidates_dropped,
)
raise RerankAllCandidatesFailedError(
f"InlierCountReRanker.rerank: zero survivors "
f"(frame_id={frame_id!r}, candidates_input={candidates_input}, "
f"candidates_dropped={candidates_dropped}, reason={reason!r})"
)
def _safe_enqueue(self, record: FdrRecord) -> None:
try:
self._fdr_client.enqueue(record) # type: ignore[union-attr]
except Exception as exc:
# FDR enqueue failures are observability-only; they must
# NEVER promote to an InlierCountReRanker drop event.
_LOG.debug(
"c2_5.rerank.fdr_enqueue_failed",
extra={
"kind": "c2_5.rerank.fdr_enqueue_failed",
"kv": {"error": repr(exc)},
},
)
def _fdr_ts(self) -> str:
ns = int(self._clock.time_ns())
seconds, fraction_ns = divmod(ns, 1_000_000_000)
dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
# ISO-8601 with a nanosecond fractional part and an explicit UTC
# offset. ``datetime`` itself only carries microseconds, so the
# fraction is formatted from the integer nanosecond count; parsers
# limited to microseconds (``datetime.fromisoformat`` on Python 3.11+
# truncates extra digits) lose the last three digits on read-back.
return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction_ns:09d}+00:00"
def _ensure_bgr_array(image: object) -> np.ndarray | None:
"""Coerce ``NavCameraFrame.image`` into a BGR ``np.ndarray``.
Accepts an already-decoded array (returned as-is) or a JPEG/PNG
byte buffer (decoded via ``cv2.imdecode``). Anything else returns
``None`` so the caller routes through the backbone-error drop path.
"""
if isinstance(image, np.ndarray):
return image
if isinstance(image, (bytes, bytearray, memoryview)):
data = bytes(image)
if not data:
return None
buf = np.frombuffer(data, dtype=np.uint8)
return cv2.imdecode(buf, cv2.IMREAD_COLOR)
return None
def _decode_jpeg(jpeg_view: memoryview) -> np.ndarray:
"""Decode a JPEG ``memoryview`` into a BGR ``np.ndarray``.
Raises :class:`ValueError` if the buffer is empty or cannot be decoded;
the caller catches it in either case and treats it as a tile-fetch-error drop.
"""
data = bytes(jpeg_view)
if not data:
raise ValueError("empty JPEG buffer")
buf = np.frombuffer(data, dtype=np.uint8)
decoded = cv2.imdecode(buf, cv2.IMREAD_COLOR)
if decoded is None:
raise ValueError("cv2.imdecode returned None for tile JPEG")
return decoded
# ----------------------------------------------------------------------
# Module-level factory entry-point consumed by
# :mod:`gps_denied_onboard.runtime_root.rerank_factory.build_rerank_strategy`.
def create(
config: Config,
*,
tile_store: object,
lightglue_runtime: LightGlueRuntime,
feature_extractor: FeatureExtractor,
clock: Clock,
fdr_client: FdrClient | None = None,
) -> object:
"""Construct an :class:`InlierCountReRanker` from injected helpers."""
strategy = InlierCountReRanker(
config=config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
_LOG.info(
"c2_5.rerank.ready",
extra={
"kind": "c2_5.rerank.ready",
"kv": {
"strategy": "inlier_count",
"N": int(config.components["c2_5_rerank"].top_n),
"K": 10,
},
},
)
return strategy
@@ -0,0 +1,159 @@
"""`FeatureExtractor` — shared image → :class:`KeypointSet` helper (AZ-343 scope expansion).
L1 helper analogous to :mod:`gps_denied_onboard.helpers.lightglue_runtime`
and :mod:`gps_denied_onboard.helpers.ransac_filter`. Produces a
:class:`gps_denied_onboard._types.matching.KeypointSet` (the same
DTO that :class:`LightGlueRuntime.match` consumes) from a raw BGR
image.
Why a shared helper:
- C2.5 :class:`InlierCountReRanker` (AZ-343) consumes one
:class:`FeatureExtractor` instance to extract features from each
per-frame nav-camera image AND from each candidate tile's JPEG
bytes. The same instance MUST produce comparable feature sets
for both inputs — otherwise the LightGlue inlier count would
collapse to noise.
- A future C3 backbone that wants to share keypoints with C2.5
(rather than re-extracting them) can read the same handle from
the composition root, mirroring the
:class:`LightGlueRuntime` ownership pattern (R14 fix).
Concrete impls:
- :class:`OpenCvOrbExtractor`: CPU, deterministic, placeholder used
by tests and by the airborne binary until the C7
:class:`InferenceRuntime`-backed DISK / ALIKED extractor lands.
ORB returns binary (``uint8``) descriptors of 32 bytes; we
convert to ``float32`` per the
:class:`gps_denied_onboard._types.matching.KeypointSet` contract.
- Future: TensorRT-backed DISK / ALIKED extractor; consumes
:class:`InferenceRuntime` from C7.
This helper is intentionally L1 — it imports only ``numpy`` and
``cv2`` plus the L1 :class:`KeypointSet` DTO. Concrete strategies
that need GPU backbones live in their own modules and accept the
:class:`InferenceRuntime` via constructor injection.
"""
from __future__ import annotations
from typing import Protocol, runtime_checkable
import cv2
import numpy as np
from gps_denied_onboard._types.matching import KeypointSet
__all__ = [
"FeatureExtractor",
"FeatureExtractorError",
"OpenCvOrbExtractor",
]
# ORB descriptors are 32 bytes (256 bits). LightGlue's KeypointSet
# requires float32 descriptors so we widen ORB's uint8 output. This is
# a placeholder choice; production will swap in DISK/ALIKED (128-d
# float32) via the C7 InferenceRuntime path.
_ORB_DESCRIPTOR_BYTES = 32
_ORB_FLOAT_DESCRIPTOR_DIM = _ORB_DESCRIPTOR_BYTES * 8 # 256-d float32
class FeatureExtractorError(RuntimeError):
"""Raised on extractor construction or per-image failure."""
@runtime_checkable
class FeatureExtractor(Protocol):
"""Image → :class:`KeypointSet` Protocol.
Implementations are constructor-injected by the composition root
and shared across consumers (e.g., C2.5 :class:`InlierCountReRanker`
uses one instance for both query frames and tile pixels).
Invariants:
- :meth:`extract` returns a :class:`KeypointSet` whose
``descriptors.shape[1] == self.descriptor_dim()``.
- ``keypoints`` is shape ``(N, 2)`` ``float32`` pixel coordinates.
- ``descriptors`` is shape ``(N, descriptor_dim)`` ``float32``.
- Images with zero detected keypoints return an empty-but-shaped
:class:`KeypointSet` (``N == 0``) rather than raising; the
C2.5 strategy treats zero-feature candidates as drop events.
- Deterministic for fixed inputs (no internal RNG state).
"""
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
"""Detect keypoints + compute descriptors on a single image."""
...
def descriptor_dim(self) -> int:
"""Return the dim of every descriptor row produced by :meth:`extract`."""
...
class OpenCvOrbExtractor:
"""CPU :class:`FeatureExtractor` backed by ``cv2.ORB_create``.
Placeholder implementation: ORB is fast (~5 ms / 480p image on a
modern CPU) and stable enough to exercise the C2.5 strategy's
orchestration logic, but its uint8 binary descriptors are NOT a
drop-in for LightGlue-trained DISK/ALIKED features. Production
deployments MUST replace this extractor with a deep-learning
backbone before flight (tracked under the future C2.5
backbone-extractor task).
The ``nfeatures`` constructor arg caps the number of keypoints
per image; default 1024 mirrors typical DISK / ALIKED budgets.
"""
def __init__(self, *, nfeatures: int = 1024) -> None:
if nfeatures < 1:
raise FeatureExtractorError(
f"OpenCvOrbExtractor.nfeatures must be >= 1; got {nfeatures}"
)
self._nfeatures: int = nfeatures
# ORB itself is created lazily and cached on first use, so
# constructing the extractor never calls into OpenCV and the
# creation cost is paid once. The module-level ``import cv2``
# still requires OpenCV to be importable.
self._orb: cv2.ORB | None = None
def descriptor_dim(self) -> int:
return _ORB_FLOAT_DESCRIPTOR_DIM
def _get_orb(self) -> cv2.ORB:
if self._orb is None:
self._orb = cv2.ORB_create(nfeatures=self._nfeatures)
return self._orb
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
if image_bgr.ndim == 3:
gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
elif image_bgr.ndim == 2:
gray = image_bgr
else:
raise FeatureExtractorError(
"image_bgr must be 2-D (gray) or 3-D (BGR); "
f"got ndim={image_bgr.ndim} shape={image_bgr.shape}"
)
if gray.dtype != np.uint8:
gray = gray.astype(np.uint8)
try:
keypoints_cv, descriptors_uint8 = self._get_orb().detectAndCompute(
gray, mask=None
)
except cv2.error as exc:
raise FeatureExtractorError(f"cv2.ORB.detectAndCompute failed: {exc}") from exc
if descriptors_uint8 is None or len(keypoints_cv) == 0:
keypoints = np.zeros((0, 2), dtype=np.float32)
descriptors = np.zeros((0, _ORB_FLOAT_DESCRIPTOR_DIM), dtype=np.float32)
return KeypointSet(keypoints=keypoints, descriptors=descriptors)
keypoints = np.array(
[(kp.pt[0], kp.pt[1]) for kp in keypoints_cv], dtype=np.float32
)
# Expand each 32-byte ORB descriptor to a 256-d float32 vector
# of bit indicators (0/1). Matches the contract that
# ``KeypointSet.descriptors`` is float32.
bits = np.unpackbits(descriptors_uint8, axis=1).astype(np.float32)
return KeypointSet(keypoints=keypoints, descriptors=bits)
@@ -28,12 +28,15 @@ from typing import TYPE_CHECKING
from gps_denied_onboard.runtime_root.errors import StrategyNotAvailableError
if TYPE_CHECKING:
from gps_denied_onboard.clock import Clock
from gps_denied_onboard.components.c2_5_rerank import (
C2_5RerankConfig,
ReRankStrategy,
)
from gps_denied_onboard.components.c6_tile_cache import TileStore
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrClient
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractor
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntime
__all__ = ["build_rerank_strategy"]
@@ -71,6 +74,9 @@ def build_rerank_strategy(
*,
tile_store: "TileStore",
lightglue_runtime: "LightGlueRuntime",
feature_extractor: "FeatureExtractor",
clock: "Clock",
fdr_client: "FdrClient | None" = None,
) -> "ReRankStrategy":
"""Construct the :class:`ReRankStrategy` impl selected by config.
@@ -79,15 +85,27 @@ def build_rerank_strategy(
raises :class:`StrategyNotAvailableError` BEFORE any import.
3. Lazily imports the concrete strategy module.
4. Constructs the strategy via its module-level
``create(config, tile_store, lightglue_runtime)`` factory
function (each concrete strategy module exports ``create`` as
its public entry-point; concrete constructors stay private).
``create(config, tile_store, lightglue_runtime, feature_extractor, clock, fdr_client)``
factory function (each concrete strategy module exports
``create`` as its public entry-point; concrete constructors
stay private).
5. Emits ONE INFO log ``kind="c2_5.rerank.strategy_loaded"`` with
structured fields ``{strategy, top_n}``.
``feature_extractor`` is a shared L1 helper (AZ-343 scope
expansion) used by the concrete strategy to extract keypoints +
descriptors from each per-frame nav image AND from each
candidate's tile pixels. ``clock`` is the composition-root
:class:`Clock` (AZ-398) — strategies stamp
:attr:`RerankResult.reranked_at` via ``clock.monotonic_ns()``
rather than calling stdlib ``time`` directly (Invariant 2 of
the replay contract). ``fdr_client`` is optional — passed
through to strategies that emit FDR records; ``None`` lets the
strategy run without FDR emission (useful for tests).
Raises:
StrategyNotAvailableError: compile-time flag OFF or
concrete module not yet built (AZ-343 pending).
concrete module not yet built.
"""
block = _c2_5_config(config)
strategy = block.strategy
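The numbered steps in the docstring (flag gate, lazy import, module-level ``create``) can be sketched in isolation. This is a minimal illustration, not the factory's actual code: the registry dict, the ``demo_rerank_inlier_count`` module name, and reading the build flag from an environment variable are assumptions made for the example.

```python
import importlib
import os
import sys
import types


class StrategyNotAvailableError(RuntimeError):
    """Raised when the selected strategy is switched off or not built."""


# Hypothetical registry: strategy label -> (build flag, module name).
_STRATEGIES = {
    "inlier_count": ("BUILD_RERANK_INLIER_COUNT", "demo_rerank_inlier_count"),
}


def build_strategy(label, **deps):
    flag, module_name = _STRATEGIES[label]
    # Gate on the flag BEFORE importing, so a disabled strategy never
    # loads its module.
    if os.environ.get(flag, "OFF") != "ON":
        raise StrategyNotAvailableError(f"{flag} is OFF")
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise StrategyNotAvailableError(f"{module_name} not built") from exc
    # Each strategy module exposes a module-level ``create`` entry point.
    return module.create(**deps)


# Stand-in for a concrete strategy module so the sketch runs on its own.
_demo = types.ModuleType("demo_rerank_inlier_count")
_demo.create = lambda **deps: ("inlier_count", sorted(deps))
sys.modules["demo_rerank_inlier_count"] = _demo
```

Checking the flag first keeps the failure cheap and makes it possible to assert that the module never entered ``sys.modules`` when the flag is off.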
@@ -128,12 +146,18 @@ def build_rerank_strategy(
config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
else:
instance = create_fn(
config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
fdr_client=fdr_client,
)
_LOG.info(
"c2_5.rerank.strategy_loaded",
@@ -0,0 +1,889 @@
"""AZ-343 — :class:`InlierCountReRanker` acceptance + NFR coverage.
Covers AC-1..AC-12 from the task spec at
``_docs/02_tasks/todo/AZ-343_c2_5_inlier_count_reranker.md``.
Performance NFR (C2.5-PT-01 p95 ≤ 80 ms for 10 single-pair LightGlue
passes against the real TRT engine) is deferred to Step 9 / E-BBT per
the task's "Excluded" section — the harness here uses test doubles
that bypass real GPU work.
"""
from __future__ import annotations
import logging
from dataclasses import dataclass
import numpy as np
import pytest
from gps_denied_onboard._types.matching import CorrespondenceSet, KeypointSet
from gps_denied_onboard._types.nav import NavCameraFrame
from gps_denied_onboard._types.rerank import RerankResult
from gps_denied_onboard._types.vpr import VprCandidate, VprResult
from gps_denied_onboard.components.c2_5_rerank import (
C2_5RerankConfig,
RerankAllCandidatesFailedError,
ReRankStrategy,
)
from gps_denied_onboard.components.c2_5_rerank.inlier_based_reranker import (
InlierCountReRanker,
create,
)
from gps_denied_onboard.components.c6_tile_cache import TilePixelHandle
from gps_denied_onboard.components.c6_tile_cache.errors import TileNotFoundError
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client import FdrRecord
from gps_denied_onboard.helpers.feature_extractor import FeatureExtractorError
from gps_denied_onboard.helpers.lightglue_runtime import LightGlueRuntimeError
# ----------------------------------------------------------------------
# Test doubles
@dataclass
class _FakeClock:
_t: int = 1_700_000_000_000_000_000
def monotonic_ns(self) -> int:
self._t += 1
return self._t
def time_ns(self) -> int:
return self._t
def sleep_until_ns(self, target_ns: int) -> None:
return None
class _FakeTilePixelHandle(TilePixelHandle):
"""Reusable :class:`TilePixelHandle` — supports multi-shot ``with`` blocks.
The buffer is mutable so AC-7 can prove identity (mutation through
one ``with`` block must be visible through the next).
"""
def __init__(self, jpeg_bytes: bytes, path):
self._buf = bytearray(jpeg_bytes)
self._path = path
@property
def filesystem_path(self):
return self._path
def __enter__(self) -> memoryview:
return memoryview(self._buf)
def __exit__(self, exc_type, exc_val, exc_tb) -> None:
return None
def mutate(self, new_bytes: bytes) -> None:
self._buf = bytearray(new_bytes)
def _synthesise_jpeg(seed: int) -> bytes:
"""Produce a deterministic colour JPEG keyed off ``seed``."""
import cv2
rng = np.random.default_rng(seed)
image = rng.integers(0, 255, size=(32, 32, 3), dtype=np.uint8)
ok, buf = cv2.imencode(".jpg", image)
assert ok, "cv2.imencode failed in test fixture"
return bytes(buf)
class _FakeTileStore:
"""Returns deterministic handles per ``tile_id``; can be told to fail."""
def __init__(self):
from pathlib import Path
self._handles: dict[tuple, _FakeTilePixelHandle] = {}
self._fail: set[tuple] = set()
self._path_base = Path("/tmp/c2_5_rerank_fake")
def install(self, tile_id, *, fail: bool = False, jpeg_seed: int | None = None) -> None:
if fail:
self._fail.add(tile_id)
return
if jpeg_seed is None:
jpeg_seed = hash(tile_id) & 0xFFFF
self._handles[tile_id] = _FakeTilePixelHandle(
jpeg_bytes=_synthesise_jpeg(jpeg_seed),
path=self._path_base / f"{tile_id}.jpg",
)
def handle(self, tile_id) -> _FakeTilePixelHandle:
return self._handles[tile_id]
def read_tile_pixels(self, tile_id):
if tile_id in self._fail:
raise TileNotFoundError(f"fake: {tile_id} marked as failing")
return self._handles[tile_id]
def write_tile(self, tile_blob, metadata):
raise NotImplementedError
def tile_exists(self, tile_id):
return tile_id in self._handles
def delete_tile(self, tile_id):
return self._handles.pop(tile_id, None) is not None
class _FakeFeatureExtractor:
"""Returns a deterministic :class:`KeypointSet` per image; can fail."""
def __init__(self) -> None:
self._fail_calls: set[int] = set()
self._call_count = 0
def fail_on(self, call_index: int) -> None:
self._fail_calls.add(call_index)
def descriptor_dim(self) -> int:
return 256
def extract(self, image_bgr: np.ndarray) -> KeypointSet:
idx = self._call_count
self._call_count += 1
if idx in self._fail_calls:
raise FeatureExtractorError(f"fake extractor failing on call {idx}")
return KeypointSet(
keypoints=np.zeros((4, 2), dtype=np.float32),
descriptors=np.zeros((4, 256), dtype=np.float32),
)
class _ProgrammableLightGlue:
"""Returns the next pre-programmed :class:`CorrespondenceSet`; can raise."""
def __init__(self) -> None:
self._calls: list[
tuple[KeypointSet, KeypointSet]
] = []
self._results: list[object] = []  # CorrespondenceSet | BaseException
def queue_inliers(self, count: int) -> None:
self._results.append(_make_correspondence_set(count))
def queue_error(self, exc: BaseException) -> None:
self._results.append(exc)
def descriptor_dim(self) -> int:
return 256
def match(self, features_a: KeypointSet, features_b: KeypointSet) -> CorrespondenceSet:
self._calls.append((features_a, features_b))
if not self._results:
raise AssertionError(
"fake LightGlue ran out of programmed responses; queue more"
)
result = self._results.pop(0)
if isinstance(result, BaseException):
raise result
return result
def match_batch(self, features_a_list, features_b_list):
raise NotImplementedError
@property
def calls(self) -> list[tuple[KeypointSet, KeypointSet]]:
return self._calls
class _CapturingFdrClient:
def __init__(self) -> None:
self.records: list[FdrRecord] = []
def enqueue(self, record: FdrRecord) -> None:
self.records.append(record)
def _make_correspondence_set(count: int) -> CorrespondenceSet:
return CorrespondenceSet(
correspondences=np.zeros((count, 4), dtype=np.float32),
scores=np.full((count,), 0.5, dtype=np.float32),
)
def _make_frame(frame_id: int = 7) -> NavCameraFrame:
from datetime import datetime, timezone
image = (np.random.default_rng(frame_id).integers(0, 255, (16, 16, 3))).astype(
np.uint8
)
return NavCameraFrame(
frame_id=frame_id,
timestamp=datetime.now(tz=timezone.utc),
image=image,
camera_calibration_id="cam0",
)
def _make_vpr_candidate(*, tile_id, distance: float) -> VprCandidate:
return VprCandidate(tile_id=tile_id, descriptor_distance=distance, descriptor_dim=256)
def _make_vpr_result(*, frame_id: int, candidates: list[VprCandidate]) -> VprResult:
return VprResult(
frame_id=frame_id,
candidates=tuple(candidates),
retrieved_at=10,
backbone_label="ultra_vpr",
)
def _build_reranker(
*,
tile_store: _FakeTileStore,
extractor: _FakeFeatureExtractor,
lightglue: _ProgrammableLightGlue,
fdr_client=None,
top_n: int = 3,
debug_per_frame_log: bool = False,
) -> InlierCountReRanker:
config = Config.with_blocks(
c2_5_rerank=C2_5RerankConfig(
strategy="inlier_count",
top_n=top_n,
debug_per_frame_log=debug_per_frame_log,
)
)
return InlierCountReRanker(
config=config,
tile_store=tile_store,
lightglue_runtime=lightglue,
feature_extractor=extractor,
clock=_FakeClock(),
fdr_client=fdr_client,
)
def _install_k_candidates(
tile_store: _FakeTileStore,
*,
k: int,
distances: list[float] | None = None,
fail_indices: set[int] | None = None,
) -> list[VprCandidate]:
distances = distances or [0.1 * i for i in range(k)]
fail_indices = fail_indices or set()
candidates: list[VprCandidate] = []
for i in range(k):
tile_id = (18, 49.0 + i * 0.001, 36.0 + i * 0.001)
tile_store.install(tile_id, fail=i in fail_indices, jpeg_seed=i)
candidates.append(_make_vpr_candidate(tile_id=tile_id, distance=distances[i]))
return candidates
# ----------------------------------------------------------------------
# Calibration fixture (the strategy ignores it for now — Protocol shape only).
@pytest.fixture
def calibration():
from gps_denied_onboard._types.calibration import CameraCalibration
return CameraCalibration(
camera_id="cam0",
intrinsics_3x3=np.eye(3, dtype=np.float32),
distortion=np.zeros((5,), dtype=np.float32),
body_to_camera_se3=np.eye(4, dtype=np.float32),
acquisition_method="synthetic",
)
# ----------------------------------------------------------------------
# AC-1 — Protocol conformance.
def test_ac1_isinstance_rerank_strategy(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=_ProgrammableLightGlue(),
)
# Assert
assert isinstance(reranker, ReRankStrategy)
assert hasattr(reranker, "rerank")
# ----------------------------------------------------------------------
# AC-2 — top-N ordering with mixed inlier counts + ties + zeros.
def test_ac2_top_n_ordering_and_tie_break(calibration) -> None:
# Arrange
inlier_counts = [412, 198, 287, 153, 287, 0, 65, 412, 89, 234]
descriptor_distances = [0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
tile_store = _FakeTileStore()
candidates = _install_k_candidates(
tile_store, k=10, distances=descriptor_distances
)
lightglue = _ProgrammableLightGlue()
for count in inlier_counts:
lightglue.queue_inliers(count)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
top_n=3,
)
vpr_result = _make_vpr_result(frame_id=12, candidates=candidates)
# Act
result = reranker.rerank(_make_frame(12), vpr_result, n=3, calibration=calibration)
# Assert
assert len(result.candidates) == 3
assert result.candidates[0].inlier_count == 412
assert result.candidates[0].descriptor_distance == pytest.approx(0.1)
assert result.candidates[1].inlier_count == 412
assert result.candidates[1].descriptor_distance == pytest.approx(0.8)
assert result.candidates[2].inlier_count == 287
assert result.candidates[2].descriptor_distance == pytest.approx(0.2)
# Zero-inlier candidate is dropped; candidates_dropped accounts for it.
assert result.candidates_dropped >= 1
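The ordering asserted by this test is consistent with sorting by inlier count descending and breaking ties by ascending descriptor distance, after removing zero-inlier candidates. The sketch below reproduces that rule on the same numbers; ``Scored`` and ``top_n`` are hypothetical names, and the tie-break rule is inferred from the assertions rather than taken from the strategy's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    tile_index: int
    inlier_count: int
    descriptor_distance: float


def top_n(scored, n):
    # Zero-inlier candidates are dropped before ranking.
    survivors = [s for s in scored if s.inlier_count > 0]
    # More inliers first; ties go to the smaller descriptor distance.
    survivors.sort(key=lambda s: (-s.inlier_count, s.descriptor_distance))
    return survivors[:n], len(scored) - len(survivors)


counts = [412, 198, 287, 153, 287, 0, 65, 412, 89, 234]
distances = [0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
scored = [Scored(i, c, d) for i, (c, d) in enumerate(zip(counts, distances))]
top, dropped = top_n(scored, 3)
```

Here ``dropped`` counts only candidates removed for zero inliers, not the survivors cut off by the top-N limit.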
# ----------------------------------------------------------------------
# AC-3 — drop-and-continue on LightGlue failure.
def test_ac3_drop_and_continue_on_backbone_error(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
for i in range(10):
if i == 3:
lightglue.queue_error(LightGlueRuntimeError("boom"))
else:
lightglue.queue_inliers(100 + i)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=21, candidates=candidates)
# Act
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(21), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
assert result.candidates_dropped >= 1
backbone_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
]
assert len(backbone_errors) == 1
assert getattr(backbone_errors[0], "kv", {}).get("reason") == "lightglue_forward_failed"
backbone_fdr = [r for r in fdr.records if r.kind == "rerank.backbone_error"]
assert len(backbone_fdr) == 1
# ----------------------------------------------------------------------
# AC-4 — drop-and-continue on TileStore failure.
def test_ac4_drop_and_continue_on_tile_fetch_error(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10, fail_indices={6})
lightglue = _ProgrammableLightGlue()
for _ in range(9):
lightglue.queue_inliers(200)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=42, candidates=candidates)
# Act
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(42), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
assert result.candidates_dropped >= 1
tile_fetch_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.tile_fetch_error"
]
assert len(tile_fetch_errors) == 1
tile_fetch_fdr = [r for r in fdr.records if r.kind == "rerank.tile_fetch_error"]
assert len(tile_fetch_fdr) == 1
# ----------------------------------------------------------------------
# AC-5 — zero survivors raises RerankAllCandidatesFailedError.
def test_ac5_zero_survivors_raises(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
for _ in range(10):
lightglue.queue_error(LightGlueRuntimeError("everything-fails"))
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=99, candidates=candidates)
# Act / Assert
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
with pytest.raises(RerankAllCandidatesFailedError):
reranker.rerank(
_make_frame(99), vpr_result, n=3, calibration=calibration
)
backbone_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
]
assert len(backbone_errors) == 10
all_failed = [
r for r in caplog.records if r.message == "c2_5.rerank.all_failed"
]
assert len(all_failed) == 1
all_failed_fdr = [r for r in fdr.records if r.kind == "rerank.all_failed"]
assert len(all_failed_fdr) == 1
payload = all_failed_fdr[0].payload
assert payload["candidates_input"] == 10
assert payload["candidates_dropped"] == 10
# ----------------------------------------------------------------------
# AC-6 — fewer than N survivors → WARN log + partial result.
def test_ac6_fewer_than_n_survivors_warn(calibration, caplog) -> None:
# Arrange — 8 fail, 2 succeed.
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
# Two succeed, six fail with LightGlueRuntimeError, two return zero inliers.
success_indices = {0, 5}
zero_indices = {2, 8}
for i in range(10):
if i in success_indices:
lightglue.queue_inliers(300 + i)
elif i in zero_indices:
lightglue.queue_inliers(0)
else:
lightglue.queue_error(LightGlueRuntimeError("bad"))
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=55, candidates=candidates)
# Act
with caplog.at_level(logging.WARNING, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(55), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 2
assert result.candidates_dropped == 8
warn_records = [
r for r in caplog.records if r.message == "c2_5.rerank.fewer_than_n_survivors"
]
assert len(warn_records) == 1
assert getattr(warn_records[0], "kv", {}).get("requested") == 3
assert getattr(warn_records[0], "kv", {}).get("returned") == 2
assert getattr(warn_records[0], "kv", {}).get("dropped") == 8
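The drop-and-continue behaviour exercised by AC-3 through AC-6 can be summarised as one loop: a per-candidate failure or a zero-inlier match drops that candidate only, zero survivors raise, and fewer than ``n`` survivors produce a warning plus a partial result. The following is a simplified sketch under those assumptions; ``MatchError``, ``AllCandidatesFailedError``, and the ``warnings`` list stand in for the real error types and log records.

```python
class MatchError(Exception):
    """Stand-in for a per-candidate matcher failure."""


class AllCandidatesFailedError(Exception):
    """Stand-in for RerankAllCandidatesFailedError."""


def rerank(candidates, match_fn, n, warnings):
    survivors, dropped = [], 0
    for cand in candidates:
        try:
            inliers = match_fn(cand)
        except MatchError:
            dropped += 1  # drop this candidate, keep going
            continue
        if inliers == 0:
            dropped += 1  # zero inliers also counts as a drop
            continue
        survivors.append((cand, inliers))
    if not survivors:
        raise AllCandidatesFailedError(f"{dropped}/{len(candidates)} dropped")
    survivors.sort(key=lambda pair: -pair[1])
    top = survivors[:n]
    if len(top) < n:
        warnings.append({"requested": n, "returned": len(top), "dropped": dropped})
    return top, dropped


def make_matcher(plan):
    def match(cand):
        outcome = plan[cand]
        if isinstance(outcome, Exception):
            raise outcome
        return outcome
    return match


# Same shape as the AC-6 scenario: 2 succeed, 2 return zero, 6 raise.
plan = {i: MatchError("bad") for i in range(10)}
plan.update({0: 300, 5: 305, 2: 0, 8: 0})
warnings = []
top, dropped = rerank(list(range(10)), make_matcher(plan), 3, warnings)
```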
# ----------------------------------------------------------------------
# AC-7 — tile_pixels_handle is a reference, not a copy.
def test_ac7_tile_pixels_handle_is_reference(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=3)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(500)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
top_n=3,
)
vpr_result = _make_vpr_result(frame_id=1, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(1), vpr_result, n=3, calibration=calibration
)
# Assert — identity preservation against the TileStore-returned handle.
for survivor in result.candidates:
original = tile_store.handle(survivor.tile_id)
assert survivor.tile_pixels_handle is original
# ----------------------------------------------------------------------
# AC-8 — descriptor_distance carried forward unchanged.
def test_ac8_descriptor_distance_preserved(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
distance = 0.123456789
candidates = _install_k_candidates(
tile_store, k=3, distances=[distance, 0.2, 0.3]
)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(700)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=2, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(2), vpr_result, n=3, calibration=calibration
)
# Assert
top_tile = candidates[0].tile_id
matching = [c for c in result.candidates if c.tile_id == top_tile]
assert matching
assert matching[0].descriptor_distance == distance
# ----------------------------------------------------------------------
# AC-9 — deterministic same-inputs → bit-identical RerankResult.candidates.
def test_ac9_deterministic_candidates(calibration) -> None:
# Arrange — single reranker instance called three times so the
# injected clock advances between calls (AC-9: reranked_at MUST
# differ across calls but candidates MUST NOT).
counts = [40, 90, 70, 10, 60, 30, 80, 20, 50, 100]
distances = [0.1 * i for i in range(10)]
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10, distances=distances)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
for c in counts:
lightglue.queue_inliers(c)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=314, candidates=candidates)
# Act
runs: list[RerankResult] = [
reranker.rerank(_make_frame(314), vpr_result, n=3, calibration=calibration)
for _ in range(3)
]
# Assert
triples = [
tuple((c.tile_id, c.inlier_count, c.descriptor_distance) for c in r.candidates)
for r in runs
]
assert triples[0] == triples[1] == triples[2]
# reranked_at differs across calls because Clock.monotonic_ns advances.
assert runs[0].reranked_at != runs[1].reranked_at
assert runs[1].reranked_at != runs[2].reranked_at
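The property checked here follows from injecting the clock rather than reading time inside the strategy: the ranking depends only on the inputs, while the timestamp depends only on the clock. A minimal sketch, assuming a ``Clock`` protocol with a single ``monotonic_ns`` method; ``FakeClock``, ``Result``, and this ``rerank`` are illustrative stand-ins.

```python
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    def monotonic_ns(self) -> int: ...


@dataclass
class FakeClock:
    t: int = 0

    def monotonic_ns(self) -> int:
        self.t += 1  # advances on every read, like the test double
        return self.t


@dataclass(frozen=True)
class Result:
    candidates: tuple
    reranked_at: int


def rerank(inlier_counts, n, clock: Clock) -> Result:
    # Pure function of the inputs: indices ordered by inlier count, descending.
    order = sorted(range(len(inlier_counts)), key=lambda i: -inlier_counts[i])
    # The only time-dependent field comes from the injected clock.
    return Result(candidates=tuple(order[:n]), reranked_at=clock.monotonic_ns())


clock = FakeClock()
counts = [40, 90, 70, 10, 60, 30, 80, 20, 50, 100]
runs = [rerank(counts, 3, clock) for _ in range(3)]
```

Replaying a recording with a scripted clock then reproduces both the candidates and the timestamps, which is what a replay contract needs.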
# ----------------------------------------------------------------------
# AC-10 — composition-root wiring via the AZ-342 factory.
def test_ac10_composition_root_wiring(monkeypatch, caplog) -> None:
# Arrange — reuse the module already imported at file top so the
# class identity matches; the factory's lazy import picks it up
# from sys.modules unchanged.
monkeypatch.setenv("BUILD_RERANK_INLIER_COUNT", "ON")
from gps_denied_onboard.runtime_root.rerank_factory import build_rerank_strategy
config = Config.with_blocks(
c2_5_rerank=C2_5RerankConfig(strategy="inlier_count", top_n=3)
)
tile_store = _FakeTileStore()
extractor = _FakeFeatureExtractor()
lightglue = _ProgrammableLightGlue()
clock = _FakeClock()
# Act
with caplog.at_level(logging.INFO, logger="gps_denied_onboard.c2_5_rerank"):
instance = build_rerank_strategy(
config,
tile_store=tile_store,
lightglue_runtime=lightglue,
feature_extractor=extractor,
clock=clock,
)
# Assert
assert isinstance(instance, InlierCountReRanker)
assert isinstance(instance, ReRankStrategy)
assert instance._lightglue_runtime is lightglue
ready_logs = [r for r in caplog.records if r.message == "c2_5.rerank.ready"]
assert len(ready_logs) == 1
kv = getattr(ready_logs[0], "kv", {})
assert kv.get("strategy") == "inlier_count"
assert kv.get("N") == 3
assert kv.get("K") == 10
# ----------------------------------------------------------------------
# AC-11 — FDR rerank.frame_done emission per frame.
def test_ac11_frame_done_fdr_emission(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
successes = [412, 287, 198] + [10] * 7 # top three survive ranking.
for c in successes:
lightglue.queue_inliers(c)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=77, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(77), vpr_result, n=3, calibration=calibration
)
# Assert
frame_done = [r for r in fdr.records if r.kind == "rerank.frame_done"]
assert len(frame_done) == 1
payload = frame_done[0].payload
assert payload["frame_id"] == 77
assert payload["candidates_input"] == 10
assert payload["top_inlier_count"] == result.candidates[0].inlier_count
# ----------------------------------------------------------------------
# AC-12 — single-pair LightGlue invocation count.
def test_ac12_single_pair_lightglue_called_exactly_k_times(calibration) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10)
lightglue = _ProgrammableLightGlue()
for i in range(10):
lightglue.queue_inliers(10 + i)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
)
vpr_result = _make_vpr_result(frame_id=88, candidates=candidates)
# Act
reranker.rerank(_make_frame(88), vpr_result, n=3, calibration=calibration)
# Assert
assert len(lightglue.calls) == 10
first_query = lightglue.calls[0][0]
for query, _ in lightglue.calls[1:]:
assert query is first_query
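The identity check on the query features implies the frame is extracted once and the same object is reused for every candidate pair. A small sketch of that call pattern, with hypothetical ``extract`` and ``match`` callables standing in for the feature extractor and the LightGlue runtime:

```python
def count_inliers_per_candidate(frame, tiles, extract, match):
    query = extract(frame)  # extracted once per frame
    # One single-pair match per candidate, always against the same query.
    return [match(query, extract(tile)) for tile in tiles]


calls = []


def extract(image):
    return {"image": image}


def match(features_a, features_b):
    calls.append((features_a, features_b))
    return len(calls)


tiles = [f"tile{i}" for i in range(10)]
counts = count_inliers_per_candidate("frame", tiles, extract, match)
```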
# ----------------------------------------------------------------------
# Mixed drop-and-continue smoke (Risk-1 / Risk-2 coverage).
def test_drop_and_continue_mixed_failures(calibration, caplog) -> None:
# Arrange — 1 TileFetch failure, 1 LightGlue failure, 2 zero-inliers, 6 succeed.
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=10, fail_indices={2})
lightglue = _ProgrammableLightGlue()
# Index 2 is dropped at tile fetch; the remaining 9 indices feed LightGlue.
counts_for_remaining = [50, 75, 25, 100, 0, 80, 90, 0] # 8 entries for indices 0,1,3,4,5,6,7,8
# Index 9 hits a LightGlue error.
plan: list[object] = []
rem_iter = iter(counts_for_remaining)
for i in range(10):
if i == 2:
continue
if i == 9:
plan.append(LightGlueRuntimeError("backbone-died"))
else:
plan.append(next(rem_iter))
for item in plan:
if isinstance(item, Exception):
lightglue.queue_error(item)
else:
lightglue.queue_inliers(item)
fdr = _CapturingFdrClient()
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=fdr,
)
vpr_result = _make_vpr_result(frame_id=66, candidates=candidates)
# Act
with caplog.at_level(logging.ERROR, logger="gps_denied_onboard.c2_5_rerank"):
result = reranker.rerank(
_make_frame(66), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
# 1 tile-fetch + 1 backbone + 2 zero-inliers = 4 drops.
assert result.candidates_dropped == 4
backbone_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.backbone_error"
]
assert len(backbone_errors) == 1
tile_fetch_errors = [
r for r in caplog.records if r.message == "c2_5.rerank.tile_fetch_error"
]
assert len(tile_fetch_errors) == 1
assert any(r.kind == "rerank.backbone_error" for r in fdr.records)
assert any(r.kind == "rerank.tile_fetch_error" for r in fdr.records)
assert any(r.kind == "rerank.frame_done" for r in fdr.records)
# ----------------------------------------------------------------------
# Public API — ``InlierCountReRanker`` stays out of c2_5_rerank.__all__ (AC-8).
def test_inlier_count_reranker_not_publicly_re_exported() -> None:
# Arrange / Act
from gps_denied_onboard.components import c2_5_rerank
# Assert
assert "InlierCountReRanker" not in c2_5_rerank.__all__
# ----------------------------------------------------------------------
# Module-level create() is the factory entry-point (Outcome step 5).
def test_create_returns_inlier_count_reranker() -> None:
# Arrange
config = Config.with_blocks(
c2_5_rerank=C2_5RerankConfig(strategy="inlier_count", top_n=3)
)
# Act
instance = create(
config,
tile_store=_FakeTileStore(),
lightglue_runtime=_ProgrammableLightGlue(),
feature_extractor=_FakeFeatureExtractor(),
clock=_FakeClock(),
)
# Assert
assert isinstance(instance, InlierCountReRanker)
assert isinstance(instance, ReRankStrategy)
# ----------------------------------------------------------------------
# Health: no_input_candidates short-circuit also raises.
def test_zero_input_candidates_short_circuits(calibration) -> None:
# Arrange
reranker = _build_reranker(
tile_store=_FakeTileStore(),
extractor=_FakeFeatureExtractor(),
lightglue=_ProgrammableLightGlue(),
)
vpr_result = _make_vpr_result(frame_id=5, candidates=[])
# Act / Assert
with pytest.raises(RerankAllCandidatesFailedError):
reranker.rerank(
_make_frame(5), vpr_result, n=3, calibration=calibration
)
# ----------------------------------------------------------------------
# DEBUG gating — per-frame frame_done DEBUG only fires when configured on.
def test_debug_per_frame_log_gated_off_by_default(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=3)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(100)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
debug_per_frame_log=False,
)
vpr_result = _make_vpr_result(frame_id=33, candidates=candidates)
# Act
with caplog.at_level(logging.DEBUG, logger="gps_denied_onboard.c2_5_rerank"):
reranker.rerank(_make_frame(33), vpr_result, n=3, calibration=calibration)
# Assert
debug_records = [
r for r in caplog.records if r.message == "c2_5.rerank.frame_done"
]
assert debug_records == []
def test_debug_per_frame_log_emits_when_enabled(calibration, caplog) -> None:
# Arrange
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=3)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(100)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
debug_per_frame_log=True,
)
vpr_result = _make_vpr_result(frame_id=34, candidates=candidates)
# Act
with caplog.at_level(logging.DEBUG, logger="gps_denied_onboard.c2_5_rerank"):
reranker.rerank(_make_frame(34), vpr_result, n=3, calibration=calibration)
# Assert
debug_records = [
r for r in caplog.records if r.message == "c2_5.rerank.frame_done"
]
assert len(debug_records) == 1
# ----------------------------------------------------------------------
# FDR enqueue failures must NEVER promote to drop events (observability-only).
def test_fdr_enqueue_failure_is_swallowed(calibration) -> None:
# Arrange
class _BrokenFdr:
def enqueue(self, record):
raise RuntimeError("queue broken")
tile_store = _FakeTileStore()
candidates = _install_k_candidates(tile_store, k=3)
lightglue = _ProgrammableLightGlue()
for _ in range(3):
lightglue.queue_inliers(100)
reranker = _build_reranker(
tile_store=tile_store,
extractor=_FakeFeatureExtractor(),
lightglue=lightglue,
fdr_client=_BrokenFdr(),
)
vpr_result = _make_vpr_result(frame_id=99, candidates=candidates)
# Act
result = reranker.rerank(
_make_frame(99), vpr_result, n=3, calibration=calibration
)
# Assert
assert len(result.candidates) == 3
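One way to get the behaviour this test expects is to route every FDR emission through a best-effort helper that catches enqueue failures and treats a missing client as a no-op. The helper name and the logger name below are hypothetical; the real strategy may structure this differently.

```python
import logging

_LOG = logging.getLogger("demo.rerank")


def emit_best_effort(fdr_client, record) -> bool:
    """Try to enqueue ``record``; never let a failure reach the caller."""
    if fdr_client is None:
        return False  # FDR emission is optional
    try:
        fdr_client.enqueue(record)
    except Exception:
        # Observability only: log and carry on with the rerank.
        _LOG.warning("fdr enqueue failed; continuing", exc_info=True)
        return False
    return True


class BrokenFdr:
    def enqueue(self, record):
        raise RuntimeError("queue broken")


class CapturingFdr:
    def __init__(self):
        self.records = []

    def enqueue(self, record):
        self.records.append(record)
```

Returning a boolean instead of raising keeps the emission result available to callers that want to count failures without making it part of the ranking path.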
@@ -70,11 +70,46 @@ class _FakeLightGlueRuntime:
raise NotImplementedError
class _FakeFeatureExtractor:
def descriptor_dim(self):
return 256
def extract(self, image_bgr):
raise NotImplementedError
class _FakeClock:
def __init__(self) -> None:
self._t = 1_000_000_000
def monotonic_ns(self):
self._t += 1
return self._t
def time_ns(self):
return self._t
def sleep_until_ns(self, target_ns):
return None
class _FullReRankStrategy:
def __init__(self, config, *, tile_store, lightglue_runtime) -> None:
def __init__(
self,
config,
*,
tile_store,
lightglue_runtime,
feature_extractor=None,
clock=None,
fdr_client=None,
) -> None:
self._config = config
self._tile_store = tile_store
self._lightglue_runtime = lightglue_runtime
self._feature_extractor = feature_extractor
self._clock = clock
self._fdr_client = fdr_client
self._label = config.components["c2_5_rerank"].strategy
def rerank(self, frame, vpr_result, n, calibration):
@@ -127,6 +162,7 @@ def test_ac1_rerank_strategy_conformance_full() -> None:
_config_with_strategy(),
tile_store=_FakeTileStore(),
lightglue_runtime=_FakeLightGlueRuntime(),
clock=_FakeClock(),
)
assert isinstance(instance, ReRankStrategy)
@@ -201,6 +237,8 @@ def test_ac3_factory_rejects_missing_build_flag(
config,
tile_store=_FakeTileStore(),
lightglue_runtime=_FakeLightGlueRuntime(),
feature_extractor=_FakeFeatureExtractor(),
clock=_FakeClock(),
)
assert "BUILD_RERANK_INLIER_COUNT is OFF" in str(exc_info.value)
assert any(
@@ -219,6 +257,8 @@ def test_ac3_factory_does_not_load_module_when_flag_off(
config,
tile_store=_FakeTileStore(),
lightglue_runtime=_FakeLightGlueRuntime(),
feature_extractor=_FakeFeatureExtractor(),
clock=_FakeClock(),
)
assert module_name not in sys.modules
@@ -256,6 +296,8 @@ def test_ac5_factory_emits_info_log_on_success(
config,
tile_store=_FakeTileStore(),
lightglue_runtime=_FakeLightGlueRuntime(),
feature_extractor=_FakeFeatureExtractor(),
clock=_FakeClock(),
)
assert isinstance(instance, ReRankStrategy)
records = [
@@ -281,6 +323,8 @@ def test_ac6_strategy_resolution(monkeypatch, strategy_module_cleanup) -> None:
config,
tile_store=_FakeTileStore(),
lightglue_runtime=_FakeLightGlueRuntime(),
feature_extractor=_FakeFeatureExtractor(),
clock=_FakeClock(),
)
assert isinstance(instance, fake_cls)
assert isinstance(instance, ReRankStrategy)
@@ -373,12 +417,18 @@ def test_nfr_perf_factory_under_50ms_p99(
config = _config_with_strategy(strategy)
tile_store = _FakeTileStore()
lightglue_runtime = _FakeLightGlueRuntime()
feature_extractor = _FakeFeatureExtractor()
clock = _FakeClock()
durations_ms: list[float] = []
for _ in range(100):
t0 = time.perf_counter()
build_rerank_strategy(
config, tile_store=tile_store, lightglue_runtime=lightglue_runtime
config,
tile_store=tile_store,
lightglue_runtime=lightglue_runtime,
feature_extractor=feature_extractor,
clock=clock,
)
durations_ms.append((time.perf_counter() - t0) * 1000.0)