[AZ-342] C2.5 ReRankStrategy: Protocol + DTOs + factory + composition

Foundational scaffolding for the InlierCountReRanker (AZ-343) and
the future C3 CrossDomainMatcher consumer (AZ-344). No concrete
re-ranker is implemented here.

* ReRankStrategy Protocol (single rerank(frame, vpr_result, n,
  calibration) -> RerankResult method) with all 8 invariants in the
  docstring — notably INV-8 drop-and-continue (per-candidate failure
  NEVER propagates unless every candidate fails).
* DTOs moved to L1 _types/rerank.py — RerankCandidate, RerankResult;
  frozen+slots; tuple-not-list for RerankResult.candidates; tile_id
  encoded as (zoom_level, lat, lon) tuple to keep _types/ free of any
  c6_tile_cache (L3) import per module-layout.md.
* Error family: RerankError + RerankBackboneError +
  RerankAllCandidatesFailedError. Only RerankAllCandidatesFailedError
  escapes rerank(); RerankBackboneError is caught inside the per-
  candidate loop, logged ERROR, FDR-stamped, candidate dropped.
* C2_5RerankConfig (strategy enum default "inlier_count", top_n int
  default 3) with strict validation at load; registered into
  Config.components on c2_5_rerank import.
* build_rerank_strategy(config, *, tile_store, lightglue_runtime)
  factory: 1-strategy resolution table, lazy import,
  BUILD_RERANK_<variant> gate, ImportError → StrategyNotAvailableError
  mapping. The shared LightGlueRuntime is constructor-injected
  (R14 fix: neither C2.5 nor C3 owns its lifecycle).

Renamed the Protocol from the existing stub "RerankStrategy" to
"ReRankStrategy" to match the contract; updated module-layout.md.
Removed the legacy RerankResult shape from _types/vpr.py — the
v1.0.0 shape lives in _types/rerank.py.

Excluded per task spec:
* Concrete InlierCountReRanker (AZ-343).
* C3 matcher protocol task (AZ-344, next in batch).
* AC-9 single-thread binding + AC-10 LightGlueRuntime identity-share
  between C2.5/C3 — deferred per task spec Risk 3 until the generic
  compose_root thread-binding registry and the C3 factory both land.

Tests: AC-1..AC-8 + AC-11 + NFR-perf-factory in
tests/unit/c2_5_rerank/test_protocol_conformance.py. The legacy
smoke test is removed. Full sweep: 997 passed (one pre-existing
flake in test_az296_takeoff_abort, subprocess timing, unrelated to
this commit; passes in isolation).
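
For readers unfamiliar with protocol conformance checks, the following sketch shows what an `isinstance` test against a `@runtime_checkable` Protocol does and does not verify. The class names and the `has_contract_signature` helper are hypothetical and are not taken from test_protocol_conformance.py.

```python
import inspect
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ReRankStrategySketch(Protocol):
    def rerank(self, frame: Any, vpr_result: Any, n: int, calibration: Any) -> Any: ...


class FakeReRanker:
    """Matches the contract signature."""

    def rerank(self, frame, vpr_result, n, calibration):
        return ()


class WrongArity:
    """Has a rerank attribute but the wrong parameters."""

    def rerank(self):
        return ()


class NoRerank:
    """Lacks the method entirely."""


def conforms(obj) -> bool:
    # Structural check: only the presence of a rerank attribute is tested.
    return isinstance(obj, ReRankStrategySketch)


def has_contract_signature(cls) -> bool:
    # Stricter check: compare parameter names against the contract.
    params = list(inspect.signature(cls.rerank).parameters)
    return params == ["self", "frame", "vpr_result", "n", "calibration"]
```

The `isinstance` check accepts `WrongArity` because runtime protocol checks ignore signatures, so a conformance suite that cares about parameter names needs a separate signature comparison such as the one above.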

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 05:31:27 +03:00
parent 3665acef66
commit d6756f1855
12 changed files with 871 additions and 54 deletions
@@ -1,17 +1,98 @@
-"""C2.5 `RerankStrategy` Protocol.
+"""C2.5 ``ReRankStrategy`` Protocol (AZ-342).
 
-Default: `InlierBasedReranker` (single-pair LightGlue inlier counter, K=10 → N=3).
-See `_docs/02_document/components/03_c2_5_rerank/`.
+PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``. Its
+single ``rerank`` method consumes a C2 :class:`VprResult` and produces
+a :class:`RerankResult` ranked by single-pair LightGlue inlier count.
+
+The concrete implementation, :class:`InlierCountReRanker` (AZ-343),
+lives in a sibling module and is imported lazily by
+:mod:`gps_denied_onboard.runtime_root.rerank_factory`.
+
+The contract at
+``_docs/02_document/contracts/c2_5_rerank/rerank_strategy_protocol.md``
+v1.0.0 is the authoritative shape; this module mirrors it 1:1.
 """
 
 from __future__ import annotations
 
-from typing import Protocol
+from typing import TYPE_CHECKING, Protocol, runtime_checkable
 
-from gps_denied_onboard._types.vpr import RerankResult, VprResult
+if TYPE_CHECKING:
+    from gps_denied_onboard._types.calibration import CameraCalibration
+    from gps_denied_onboard._types.nav import NavCameraFrame
+    from gps_denied_onboard._types.rerank import RerankResult
+    from gps_denied_onboard._types.vpr import VprResult
+
+__all__ = ["ReRankStrategy"]
 
 
-class RerankStrategy(Protocol):
-    """Re-rank C2's top-K candidates down to N via cross-domain match scoring."""
+@runtime_checkable
+class ReRankStrategy(Protocol):
+    """Single-camera re-rank strategy.
 
-    def rerank(self, vpr_result: VprResult, n_keep: int = 3) -> RerankResult: ...
+    Stateless per-frame; the only persistent state is the
+    constructor-injected
+    :class:`gps_denied_onboard.helpers.lightglue_runtime.LightGlueRuntime`
+    helper handle and the :class:`TileStore` Public API reference.
+
+    Invariants (see ``rerank_strategy_protocol.md`` v1.0.0):
+
+    - **INV-1 single-threaded**: each instance is bound to one
+      ingest thread; the shared ``LightGlueRuntime`` requires serial
+      access. Concurrent :meth:`rerank` calls on a single instance
+      race the GPU stream.
+    - **INV-2 stateless per-frame**: same inputs → same surviving
+      candidates in same order.
+    - **INV-3 top-N descending by inlier_count**: ties broken
+      deterministically by ``descriptor_distance`` ascending (the
+      C2-stage value carried forward).
+    - **INV-4 candidates length bounded**: ``0 < len <= n`` when
+      returned (zero survivors raises
+      :class:`RerankAllCandidatesFailedError`); never exceeds
+      ``len(vpr_result.candidates)``.
+    - **INV-5 descriptor_distance carried forward unchanged**: the
+      C2-stage value is preserved on every survivor for FDR
+      provenance.
+    - **INV-6 tile_pixels_handle is a reference, NOT a copy**:
+      ``RerankCandidate.tile_pixels_handle`` is the same handle
+      returned by ``TileStore.read_tile_pixels`` (page-cache
+      backed).
+    - **INV-7 deterministic per tuple**: same ``(frame,
+      vpr_result, corpus, helper)`` → bit-identical
+      :class:`RerankResult`.
+    - **INV-8 drop-and-continue**: a per-candidate exception
+      NEVER propagates out of :meth:`rerank` unless EVERY candidate
+      fails. C3 relies on this partial-input tolerance.
+
+    Error envelope: only :class:`RerankAllCandidatesFailedError`
+    escapes :meth:`rerank`; per-candidate
+    :class:`RerankBackboneError` / ``TileFetchError`` from C6 are
+    caught inside the loop and turned into dropped candidates +
+    ERROR logs + per-occurrence FDR records.
+    """
+
+    def rerank(
+        self,
+        frame: "NavCameraFrame",
+        vpr_result: "VprResult",
+        n: int,
+        calibration: "CameraCalibration",
+    ) -> "RerankResult":
+        """Re-rank the top-K candidates down to top-N by inlier count.
+
+        For each ``candidate`` in ``vpr_result.candidates``:
+
+        1. Fetch tile pixels via ``TileStore.read_tile_pixels(candidate.tile_id)``.
+        2. Run a single-pair LightGlue forward via the shared
+           :class:`LightGlueRuntime` (frame ↔ tile).
+        3. Record the inlier count.
+
+        Sort candidates descending by inlier count; return the top-N
+        as a :class:`RerankResult`. Drop-and-continue semantics
+        apply per INV-8.
+
+        Raises:
+            RerankAllCandidatesFailedError: zero survivors after
+                the per-candidate loop.
+        """
+        ...