mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 21:51:12 +00:00
[AZ-342] C2.5 ReRankStrategy: Protocol + DTOs + factory + composition
Foundational scaffolding for the InlierCountReRanker (AZ-343) and the future C3 CrossDomainMatcher consumer (AZ-344). No concrete re-ranker is implemented here.

* ReRankStrategy Protocol (single rerank(frame, vpr_result, n, calibration) -> RerankResult method) with all 8 invariants in the docstring — notably INV-8 drop-and-continue (per-candidate failure NEVER propagates unless every candidate fails).
* DTOs moved to L1 _types/rerank.py — RerankCandidate, RerankResult; frozen+slots; tuple-not-list for RerankResult.candidates; tile_id encoded as (zoom_level, lat, lon) tuple to keep _types/ free of any c6_tile_cache (L3) import per module-layout.md.
* Error family: RerankError + RerankBackboneError + RerankAllCandidatesFailedError. Only RerankAllCandidatesFailedError escapes rerank(); RerankBackboneError is caught inside the per-candidate loop, logged ERROR, FDR-stamped, candidate dropped.
* C2_5RerankConfig (strategy enum default "inlier_count", top_n int default 3) with strict validation at load; registered into Config.components on c2_5_rerank import.
* build_rerank_strategy(config, *, tile_store, lightglue_runtime) factory: 1-strategy resolution table, lazy import, BUILD_RERANK_<variant> gate, ImportError → StrategyNotAvailableError mapping. The shared LightGlueRuntime is constructor-injected (R14 fix: neither C2.5 nor C3 owns its lifecycle).

Renamed the Protocol from the existing stub "RerankStrategy" to "ReRankStrategy" to match the contract; updated module-layout.md. Removed the legacy RerankResult shape from _types/vpr.py — the v1.0.0 shape lives in _types/rerank.py.

Excluded per task spec:

* Concrete InlierCountReRanker (AZ-343).
* C3 matcher protocol task (AZ-344, next in batch).
* AC-9 single-thread binding + AC-10 LightGlueRuntime identity-share between C2.5/C3 — deferred per task spec Risk 3 until the generic compose_root thread-binding registry and the C3 factory both land.

Tests: AC-1..AC-8 + AC-11 + NFR-perf-factory in tests/unit/c2_5_rerank/test_protocol_conformance.py. The legacy smoke test is removed. Full sweep: 997 passed (one pre-existing flake in test_az296_takeoff_abort, subprocess timing, unrelated to this commit; passes in isolation).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,6 +1,42 @@
-"""C2.5 Rerank component — Public API."""
+"""C2.5 ReRank — Public API (AZ-342).

-from gps_denied_onboard._types.vpr import RerankResult
-from gps_denied_onboard.components.c2_5_rerank.interface import RerankStrategy
+Per ``rerank_strategy_protocol.md`` v1.0.0 the public surface
+consists of:

-__all__ = ["RerankResult", "RerankStrategy"]
+- :class:`ReRankStrategy` Protocol (one method).
+- DTOs re-exported from :mod:`gps_denied_onboard._types.rerank` (the
+  L1 home for cross-component DTOs): :class:`RerankCandidate`,
+  :class:`RerankResult`.
+- Error family rooted at :class:`RerankError`; two documented
+  subtypes (:class:`RerankBackboneError`,
+  :class:`RerankAllCandidatesFailedError`).
+- Config block :class:`C2_5RerankConfig` (registered on import).
+
+Concrete strategy (``InlierCountReRanker``, AZ-343) lives in a
+sibling module and is imported lazily by
+:mod:`gps_denied_onboard.runtime_root.rerank_factory` — Risk-2
+mitigation: this ``__init__.py`` MUST NOT import any concrete
+strategy module.
+"""
+
+from gps_denied_onboard._types.rerank import RerankCandidate, RerankResult
+from gps_denied_onboard.components.c2_5_rerank.config import C2_5RerankConfig
+from gps_denied_onboard.components.c2_5_rerank.errors import (
+    RerankAllCandidatesFailedError,
+    RerankBackboneError,
+    RerankError,
+)
+from gps_denied_onboard.components.c2_5_rerank.interface import ReRankStrategy
+from gps_denied_onboard.config.schema import register_component_block
+
+register_component_block("c2_5_rerank", C2_5RerankConfig)
+
+__all__ = [
+    "C2_5RerankConfig",
+    "ReRankStrategy",
+    "RerankAllCandidatesFailedError",
+    "RerankBackboneError",
+    "RerankCandidate",
+    "RerankError",
+    "RerankResult",
+]
@@ -0,0 +1,54 @@
+"""C2.5 ReRankStrategy config block (AZ-342).
+
+Registered into ``config.components['c2_5_rerank']`` by the package
+``__init__.py``. The composition-root factory
+:func:`gps_denied_onboard.runtime_root.rerank_factory.build_rerank_strategy`
+reads this block to select the strategy and configure the top-N cut.
+
+``top_n`` is the strategy-side cap on the returned
+:attr:`RerankResult.candidates` length; the composition root binds
+``n`` per-frame from this value (default 3 per the epic's K=10 → N=3
+spec).
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Final
+
+from gps_denied_onboard.config.schema import ConfigError
+
+__all__ = [
+    "C2_5RerankConfig",
+    "KNOWN_STRATEGIES",
+]
+
+KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset({"inlier_count"})
+
+
+@dataclass(frozen=True)
+class C2_5RerankConfig:
+    """Per-component config for C2.5 ReRank.
+
+    ``strategy`` selects exactly one of the registered re-rankers
+    (today only ``inlier_count``); the composition-root factory
+    respects compile-time ``BUILD_RERANK_<variant>`` gating on top
+    of this label.
+
+    ``top_n`` is the per-frame N cap (1..K-1). Default 3 (the epic's
+    K=10 → N=3 spec).
+    """
+
+    strategy: str = "inlier_count"
+    top_n: int = 3
+
+    def __post_init__(self) -> None:
+        if self.strategy not in KNOWN_STRATEGIES:
+            raise ConfigError(
+                f"C2_5RerankConfig.strategy={self.strategy!r} not in "
+                f"{sorted(KNOWN_STRATEGIES)}"
+            )
+        if self.top_n < 1:
+            raise ConfigError(
+                f"C2_5RerankConfig.top_n must be >= 1; got {self.top_n}"
+            )
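The config block above validates in `__post_init__`, so a bad value fails at construction time rather than at first use. The docstring gives the `top_n` range as 1..K-1, while the check in this hunk enforces only the lower bound; K is presumably known only at the composition root. The sketch below mirrors the dataclass in condensed form so that it runs standalone, and adds a hypothetical `check_top_n_against_k` helper to show where an upper-bound check could live. The `ConfigError` stand-in, its base class, and the helper are assumptions, not part of the commit.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Stand-in for gps_denied_onboard.config.schema.ConfigError (assumed base)."""


KNOWN_STRATEGIES = frozenset({"inlier_count"})


@dataclass(frozen=True)
class RerankConfigSketch:
    """Condensed mirror of C2_5RerankConfig, for illustration only."""

    strategy: str = "inlier_count"
    top_n: int = 3

    def __post_init__(self) -> None:
        if self.strategy not in KNOWN_STRATEGIES:
            raise ConfigError(
                f"strategy={self.strategy!r} not in {sorted(KNOWN_STRATEGIES)}"
            )
        if self.top_n < 1:
            raise ConfigError(f"top_n must be >= 1; got {self.top_n}")


def check_top_n_against_k(cfg: RerankConfigSketch, k: int) -> None:
    """Hypothetical composition-root check for the documented 1..K-1 range."""
    if cfg.top_n >= k:
        raise ConfigError(f"top_n={cfg.top_n} must be < K={k}")


cfg = RerankConfigSketch()           # defaults: "inlier_count", top_n=3
check_top_n_against_k(cfg, k=10)     # passes for the epic's K=10 -> N=3 spec
```

Because the dataclass is frozen, a validated instance cannot drift out of range afterwards; any change requires constructing a new instance, which re-runs the checks.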
@@ -0,0 +1,56 @@
+"""C2.5 ReRankStrategy error taxonomy (AZ-342).
+
+The family is intentionally narrow: a per-candidate failure is the
+normal case (drop-and-continue, INV-8) and is signalled via
+``candidates_dropped`` in the returned :class:`RerankResult` —
+NOT via an exception. An exception escapes ``rerank`` only when
+EVERY candidate fails (:class:`RerankAllCandidatesFailedError`)
+which is the C5 → VIO-only-fallback trigger per AC-3.5.
+
+:class:`RerankBackboneError` is raised INSIDE the per-candidate loop,
+caught by the strategy, logged ERROR, FDR-stamped, and the
+candidate is dropped. It is exposed publicly so the per-candidate
+log + FDR taxonomy is observable and so future re-rankers using a
+different backbone can re-raise the same kind.
+
+``TileFetchError`` is C6-owned
+(``c6_tile_cache.errors.TileNotFoundError`` / ``TileFsError``); the
+strategy catches it in the per-candidate loop and treats it
+identically to :class:`RerankBackboneError`.
+"""
+
+from __future__ import annotations
+
+__all__ = [
+    "RerankAllCandidatesFailedError",
+    "RerankBackboneError",
+    "RerankError",
+]
+
+
+class RerankError(Exception):
+    """Base class for the C2.5 rerank error family.
+
+    Caught at the runtime root only when
+    :class:`RerankAllCandidatesFailedError` fires; per-candidate
+    failures stay inside the strategy.
+    """
+
+
+class RerankBackboneError(RerankError):
+    """Per-candidate LightGlue forward-pass failure.
+
+    CUDA OOM, TRT engine deserialize mismatch. Logged at ERROR; one
+    FDR record per occurrence; the offending candidate is dropped
+    from the rerank set; the surrounding ``rerank`` call continues
+    with the remaining candidates (INV-8).
+    """
+
+
+class RerankAllCandidatesFailedError(RerankError):
+    """Zero survivors after the per-candidate loop.
+
+    Every candidate's LightGlue or tile fetch failed. Logged at
+    ERROR; FDR record ``kind=rerank.all_failed``. C5 falls back to
+    VIO-only with provenance ``visual_propagated`` (AC-3.5).
+    """
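The error taxonomy above implies a particular loop shape: per-candidate failures are caught, counted, and dropped, and only the zero-survivor case raises. The concrete re-ranker is deferred to AZ-343, so the following is only a sketch of that control flow under simplifying assumptions. Candidates are plain `(tile_id, descriptor_distance)` pairs, `score_fn` is a hypothetical stand-in for the tile fetch plus LightGlue pass, the error classes are local mirrors, and FDR stamping is omitted.

```python
from __future__ import annotations

import logging
from typing import Callable, NamedTuple

log = logging.getLogger("c2_5_rerank.sketch")


class RerankError(Exception): ...
class RerankBackboneError(RerankError): ...
class RerankAllCandidatesFailedError(RerankError): ...


TileId = tuple[int, float, float]  # (zoom_level, lat, lon)


class Scored(NamedTuple):
    tile_id: TileId
    inlier_count: int
    descriptor_distance: float  # C2-stage value, carried forward unchanged


def rerank_sketch(
    candidates: list[tuple[TileId, float]],
    score_fn: Callable[[TileId], int],  # hypothetical: fetch tile + count inliers
    n: int,
) -> tuple[tuple[Scored, ...], int]:
    """Return (top-n survivors, number of dropped candidates)."""
    survivors: list[Scored] = []
    dropped = 0
    for tile_id, descriptor_distance in candidates:
        try:
            inliers = score_fn(tile_id)
        except RerankBackboneError:
            # Drop-and-continue: log, count, and move on to the next candidate.
            log.error("dropping candidate %s", tile_id, exc_info=True)
            dropped += 1
            continue
        survivors.append(Scored(tile_id, inliers, descriptor_distance))

    if not survivors:
        # Only this case escapes to the caller.
        raise RerankAllCandidatesFailedError(f"all {dropped} candidates failed")

    # Inlier count descending; ties broken by descriptor_distance ascending.
    survivors.sort(key=lambda s: (-s.inlier_count, s.descriptor_distance))
    return tuple(survivors[:n]), dropped
```

Sorting on the `(-inlier_count, descriptor_distance)` key gives a deterministic order for equal inlier counts, which is one way to satisfy the tie-break rule stated in the Protocol docstring later in this diff.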
@@ -1,17 +1,98 @@
-"""C2.5 `RerankStrategy` Protocol.
+"""C2.5 ``ReRankStrategy`` Protocol (AZ-342).

-Default: `InlierBasedReranker` (single-pair LightGlue inlier counter, K=10 → N=3).
-See `_docs/02_document/components/03_c2_5_rerank/`.
+PEP 544 ``typing.Protocol`` with ``runtime_checkable=True``; a single
+``rerank`` method that consumes a C2 :class:`VprResult` and produces
+a :class:`RerankResult` ranked by single-pair LightGlue inlier count.
+
+Concrete impl — :class:`InlierCountReRanker` (AZ-343) — lives in a
+sibling module and is imported lazily by
+:mod:`gps_denied_onboard.runtime_root.rerank_factory`.
+
+The contract at
+``_docs/02_document/contracts/c2_5_rerank/rerank_strategy_protocol.md``
+v1.0.0 is the authoritative shape; this module mirrors it 1:1.
 """

 from __future__ import annotations

-from typing import Protocol
+from typing import TYPE_CHECKING, Protocol, runtime_checkable

-from gps_denied_onboard._types.vpr import RerankResult, VprResult
+if TYPE_CHECKING:
+    from gps_denied_onboard._types.calibration import CameraCalibration
+    from gps_denied_onboard._types.nav import NavCameraFrame
+    from gps_denied_onboard._types.rerank import RerankResult
+    from gps_denied_onboard._types.vpr import VprResult
+
+__all__ = ["ReRankStrategy"]


-class RerankStrategy(Protocol):
-    """Re-rank C2's top-K candidates down to N via cross-domain match scoring."""
+@runtime_checkable
+class ReRankStrategy(Protocol):
+    """Single-camera re-rank strategy.

-    def rerank(self, vpr_result: VprResult, n_keep: int = 3) -> RerankResult: ...
+    Stateless per-frame; the only persistent state is the
+    constructor-injected
+    :class:`gps_denied_onboard.helpers.lightglue_runtime.LightGlueRuntime`
+    helper handle and the :class:`TileStore` Public API reference.
+
+    Invariants (see ``rerank_strategy_protocol.md`` v1.0.0):
+
+    - **INV-1 single-threaded** — each instance is bound to one
+      ingest thread; the shared ``LightGlueRuntime`` requires serial
+      access. Concurrent :meth:`rerank` calls on a single instance
+      race the GPU stream.
+    - **INV-2 stateless per-frame** — same inputs → same surviving
+      candidates in same order.
+    - **INV-3 top-N descending by inlier_count** — ties broken
+      deterministically by ``descriptor_distance`` ascending (the
+      C2-stage value carried forward).
+    - **INV-4 candidates length bounded** — ``0 < len <= n`` when
+      returned (zero raises :class:`RerankAllCandidatesFailedError`);
+      never exceeds ``n``; never exceeds
+      ``len(vpr_result.candidates)``.
+    - **INV-5 descriptor_distance carried forward unchanged** — the
+      C2-stage value is preserved on every survivor for FDR
+      provenance.
+    - **INV-6 tile_pixels_handle is a reference, NOT a copy** —
+      ``RerankCandidate.tile_pixels_handle`` is the same handle
+      returned by ``TileStore.read_tile_pixels`` (page-cache
+      backed).
+    - **INV-7 deterministic per tuple** — same ``(frame,
+      vpr_result, corpus, helper)`` → bit-identical
+      :class:`RerankResult`.
+    - **INV-8 drop-and-continue** — a per-candidate exception
+      NEVER propagates out of :meth:`rerank` unless EVERY candidate
+      fails. C3 relies on this partial-input tolerance.
+
+    Error envelope: only :class:`RerankAllCandidatesFailedError`
+    escapes :meth:`rerank`; per-candidate
+    :class:`RerankBackboneError` / ``TileFetchError`` from C6 are
+    caught inside the loop and turned into dropped candidates +
+    ERROR logs + per-occurrence FDR records.
+    """
+
+    def rerank(
+        self,
+        frame: "NavCameraFrame",
+        vpr_result: "VprResult",
+        n: int,
+        calibration: "CameraCalibration",
+    ) -> "RerankResult":
+        """Re-rank the top-K candidates down to top-N by inlier count.
+
+        For each ``candidate`` in ``vpr_result.candidates``:
+
+        1. Fetch tile pixels via ``TileStore.read_tile_pixels(candidate.tile_id)``.
+        2. Run a single-pair LightGlue forward via the shared
+           :class:`LightGlueRuntime` (frame ↔ tile).
+        3. Record the inlier count.
+
+        Sort candidates descending by inlier count; return the top-N
+        as a :class:`RerankResult`. Drop-and-continue semantics
+        apply per INV-8.
+
+        Raises:
+            RerankAllCandidatesFailedError: zero survivors after
+                the per-candidate loop.
+        """
+        ...
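The Protocol above is decorated with `@runtime_checkable`, which is what allows a conformance test to use `isinstance`. Note that in `typing` this is a decorator rather than a keyword argument, and that the runtime check only verifies that a `rerank` attribute exists; it does not compare signatures or return types. A small standalone sketch of that behavior follows, using a simplified mirror of the Protocol with `Any` in place of the project types. All class names here are hypothetical.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ReRankStrategySketch(Protocol):
    """Simplified mirror of ReRankStrategy; project types replaced by Any."""

    def rerank(self, frame: Any, vpr_result: Any, n: int, calibration: Any) -> Any: ...


class FakeReRanker:
    """Structurally conforming test double; no inheritance needed."""

    def rerank(self, frame: Any, vpr_result: Any, n: int, calibration: Any) -> Any:
        return ()


class WrongSignature:
    """Has a rerank method, but with the wrong parameters."""

    def rerank(self) -> None:
        return None


class NotAReRanker:
    pass


print(isinstance(FakeReRanker(), ReRankStrategySketch))    # True
print(isinstance(WrongSignature(), ReRankStrategySketch))  # True: presence only
print(isinstance(NotAReRanker(), ReRankStrategySketch))    # False
```

Since the second case passes the runtime check, signature conformance is better left to a static type checker, with `isinstance` serving only as a coarse guard at the composition root.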