[AZ-336] C2 VprStrategy: Protocol + DTOs + factory + composition

Foundational scaffolding for every concrete C2 backbone (UltraVPR,
NetVLAD, MegaLoc, MixVPR, SelaVPR, EigenPlaces, SALAD — AZ-337..AZ-340)
and the C2.5 ReRanker consumer side. No backbone is implemented here.

* VprStrategy Protocol (embed_query / retrieve_topk / descriptor_dim)
  + BackbonePreprocessor C2-internal Protocol (NOT in Public API per
  description.md § 6).
* DTOs in L1 _types/vpr.py — VprQuery, VprCandidate, VprResult; all
  frozen + slots; tuple-not-list for VprResult.candidates so the
  immutability invariant truly holds.
* Error family: VprError + VprBackboneError + VprPreprocessError +
  IndexUnavailableError. The last shares its name with
  c6_tile_cache.IndexUnavailableError but is a distinct class: the c2
  family is the closed envelope C5 / C2.5 consume, and concrete
  strategies rewrap the C6 form.
* C2VprConfig (strategy enum + backbone_weights_path + faiss_index_path)
  with strict validation at load; registered into Config.components on
  c2_vpr import.
* build_vpr_strategy factory with 7-strategy resolution table, lazy
  import, BUILD_VPR_<variant> gating, ImportError→
  StrategyNotAvailableError mapping, and pre-flight descriptor_dim
  match against DescriptorIndex.descriptor_dim() — mismatch fires
  ConfigError at startup, NOT at first frame.
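The closed error envelope described in the list above can be sketched roughly as follows. This is an illustration only: `StorageIndexUnavailableError` stands in for the C6 class of the same name, `MemoryError` stands in for a CUDA OOM, and the `embed`, `retrieve` and `consume` helpers are hypothetical names, not part of this commit.

```python
class VprError(Exception):
    """Base of the C2 family (mirrors the name used in this commit)."""


class VprBackboneError(VprError):
    pass


class IndexUnavailableError(VprError):
    pass


class StorageIndexUnavailableError(Exception):
    """Stand-in for c6_tile_cache.IndexUnavailableError (assumption)."""


def embed(run_forward):
    """Rewrap a low-level runtime failure into the C2 family."""
    try:
        return run_forward()
    except MemoryError as exc:  # stand-in for CUDA OOM
        raise VprBackboneError("backbone forward pass failed") from exc


def retrieve(search):
    """Rewrap the storage-layer error into its C2 namesake."""
    try:
        return search()
    except StorageIndexUnavailableError as exc:
        raise IndexUnavailableError("descriptor index unavailable") from exc


def consume(call):
    """A consumer needs a single except clause for the whole family."""
    try:
        return call()
    except VprError as exc:
        return f"fallback: {type(exc).__name__}"
```

Chaining with `from exc` keeps the original exception reachable through `__cause__`, so the storage-layer detail is not lost when the envelope is closed.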

Contract change vs the v1.0.0 draft: factory takes descriptor_index:
DescriptorIndex (not tile_store: TileStore) because descriptor_dim()
lives on DescriptorIndex per C6's Public API. The contract markdown
is updated to match.
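A minimal sketch of the factory flow (resolution table, build gating, lazy import, ImportError mapping, descriptor_dim pre-flight), assuming a two-entry table. The `demo_vpr.*` module paths, the `build_flags` dict, the exact flag spelling and `FakeIndex` are all hypothetical; the real factory lives in runtime_root.vpr_factory and is not reproduced here.

```python
import importlib
import sys
import types


class ConfigError(Exception):
    """Stand-in for the project's ConfigError."""


class StrategyNotAvailableError(Exception):
    """Stand-in for the composition-time error."""


# Hypothetical resolution table: strategy label -> (module path, class name).
_TABLE = {
    "net_vlad": ("demo_vpr.net_vlad", "NetVladStrategy"),
    "ultra_vpr": ("demo_vpr.ultra_vpr", "UltraVprStrategy"),
}


def build_vpr_strategy(label, descriptor_index, build_flags):
    """Resolve, gate, lazily import, construct, then pre-flight the dim."""
    if label not in _TABLE:
        raise ConfigError(f"unknown strategy {label!r}")
    if not build_flags.get(f"BUILD_VPR_{label.upper()}", False):
        raise StrategyNotAvailableError(f"{label} is gated off in this build")
    module_path, class_name = _TABLE[label]
    try:
        module = importlib.import_module(module_path)  # import only on demand
    except ImportError as exc:
        raise StrategyNotAvailableError(f"{label}: {exc}") from exc
    strategy = getattr(module, class_name)()
    expected = descriptor_index.descriptor_dim()
    if strategy.descriptor_dim() != expected:
        raise ConfigError(
            f"descriptor_dim mismatch: strategy={strategy.descriptor_dim()} "
            f"index={expected}"
        )
    return strategy


class FakeIndex:
    """Stand-in for DescriptorIndex; only descriptor_dim() is modelled."""

    def __init__(self, dim):
        self._dim = dim

    def descriptor_dim(self):
        return self._dim


class NetVladStrategy:
    """Fake backbone so the sketch runs without weights."""

    def descriptor_dim(self):
        return 4096


# Register the fake backbone module under the hypothetical path.
_fake = types.ModuleType("demo_vpr.net_vlad")
_fake.NetVladStrategy = NetVladStrategy
sys.modules["demo_vpr.net_vlad"] = _fake
```

With these stand-ins, a 4096-dim index accepts `net_vlad`, a 512-dim index is rejected with `ConfigError` before any frame is processed, and `ultra_vpr` (whose module does not exist here) surfaces as `StrategyNotAvailableError`.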

Architecture: VprCandidate.tile_id is a plain (zoom, lat, lon) tuple,
keeping _types/ (L1) free of any c6_tile_cache (L3) import per
module-layout.md. Consumers reconstruct TileId at the C6 boundary.

Excluded per task spec:
* Concrete backbones (AZ-337..AZ-340).
* FAISS HNSW retrieve wiring (AZ-341).
* DescriptorNormaliser helper (AZ-283, already shipped).
* AC-9 single-thread binding — deferred per task spec Risk 4 until the
  generic compose_root thread-binding registry is in place (today
  each factory owns its own, e.g. fc_factory).

Tests: 45 ACs + NFRs in tests/unit/c2_vpr/test_protocol_conformance.py
covering AC-1..AC-8, the error family, the config validation, and the
factory NFR (p99 ≤ 50 ms). The legacy smoke test is removed. Full
sweep: 973 passed, 2 skipped (CI-only cmake / actionlint).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 05:25:35 +03:00
parent 823c0f1b2e
commit 3665acef66
12 changed files with 1202 additions and 56 deletions
@@ -1,6 +1,51 @@
"""C2 VPR component — Public API."""
"""C2 VPR — Public API (AZ-336).
from gps_denied_onboard._types.vpr import VprQuery, VprResult
Per ``vpr_strategy_protocol.md`` v1.0.0 the public surface consists
of:
- :class:`VprStrategy` Protocol (3 methods).
- DTOs re-exported from :mod:`gps_denied_onboard._types.vpr` (the L1
home for cross-component DTOs): :class:`VprQuery`,
:class:`VprCandidate`, :class:`VprResult`.
- Error family rooted at :class:`VprError`; three documented
subtypes (:class:`VprBackboneError`, :class:`VprPreprocessError`,
:class:`IndexUnavailableError`).
- Config block :class:`C2VprConfig` (registered on import).
:class:`BackbonePreprocessor` is C2-internal (see
``components/02_c2_vpr/description.md`` § 6) and intentionally NOT
re-exported.
Concrete strategies (``UltraVprStrategy``, ``NetVladStrategy``,
``MegaLocStrategy``, ``MixVprStrategy``, ``SelaVprStrategy``,
``EigenPlacesStrategy``, ``SaladStrategy``) live in sibling modules
and are imported lazily by
:mod:`gps_denied_onboard.runtime_root.vpr_factory` — Risk-2
mitigation: this ``__init__.py`` MUST NOT import any concrete
strategy module.
"""
from gps_denied_onboard._types.vpr import VprCandidate, VprQuery, VprResult
from gps_denied_onboard.components.c2_vpr.config import C2VprConfig
from gps_denied_onboard.components.c2_vpr.errors import (
IndexUnavailableError,
VprBackboneError,
VprError,
VprPreprocessError,
)
from gps_denied_onboard.components.c2_vpr.interface import VprStrategy
from gps_denied_onboard.config.schema import register_component_block
__all__ = ["VprQuery", "VprResult", "VprStrategy"]
register_component_block("c2_vpr", C2VprConfig)
__all__ = [
"C2VprConfig",
"IndexUnavailableError",
"VprBackboneError",
"VprCandidate",
"VprError",
"VprPreprocessError",
"VprQuery",
"VprResult",
"VprStrategy",
]
@@ -0,0 +1,60 @@
"""C2-internal ``BackbonePreprocessor`` Protocol (AZ-336).
The preprocessor is the resize / crop / normalise step that turns a
``NavCameraFrame`` into the tensor the backbone's forward pass
expects. It is C2-internal — each concrete :class:`VprStrategy`
owns its own preprocessor; sharing across backbones is forbidden per
``components/02_c2_vpr/description.md`` § 6 (preprocessing parameters
are tightly coupled to the backbone weights, so a shared
preprocessor would let a NetVLAD instance corrupt UltraVPR's input
layout).
This Protocol is NOT re-exported from ``c2_vpr.__init__`` — keeping
it inside the package enforces the description.md § 6 boundary.
"""
from __future__ import annotations
from typing import TYPE_CHECKING, Protocol, runtime_checkable
if TYPE_CHECKING:
    import numpy as np

    from gps_denied_onboard._types.calibration import CameraCalibration
    from gps_denied_onboard._types.nav import NavCameraFrame
__all__ = ["BackbonePreprocessor"]
@runtime_checkable
class BackbonePreprocessor(Protocol):
    """Resize / crop / normalise per backbone's input contract.

    Each :class:`VprStrategy` implementation owns its concrete
    preprocessor (NOT shared across backbones). The strategy calls
    :meth:`preprocess` inside :meth:`VprStrategy.embed_query` before
    running the forward pass.
    """

    def preprocess(
        self,
        frame: "NavCameraFrame",
        calibration: "CameraCalibration",
    ) -> "np.ndarray":
        """Return the preprocessed input tensor in the backbone's layout.

        Typical shape: ``(1, 3, H, W)`` NCHW float16 for TRT engines.
        Raises :class:`VprPreprocessError` when the input frame
        violates the backbone's contract (wrong colour channels,
        calibration mismatch).
        """
        ...

    def input_shape(self) -> tuple[int, ...]:
        """``(H, W)`` resize target the backbone expects.

        Stable for the preprocessor's lifetime; consumed by tests to
        assert preprocessing fidelity.
        """
        ...
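One property of `runtime_checkable` worth keeping in mind when writing conformance tests: `isinstance` against such a Protocol only checks that the named members exist, not their signatures or return types. The sketch below uses a cut-down copy of the Protocol and three hypothetical classes to show this.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class BackbonePreprocessor(Protocol):
    def preprocess(self, frame, calibration): ...

    def input_shape(self) -> tuple[int, ...]: ...


class ToyPreprocessor:
    """Hypothetical preprocessor; returns a nested list, not a tensor."""

    def preprocess(self, frame, calibration):
        return [[0.0] * 4 for _ in range(4)]

    def input_shape(self):
        return (4, 4)


class Incomplete:
    """Missing input_shape, so the structural check fails."""

    def preprocess(self, frame, calibration):
        return []


class WrongSignature:
    """Both names exist, so isinstance passes despite the wrong shapes."""

    def preprocess(self):
        return []

    def input_shape(self):
        return "not a tuple"


checks = {
    "toy": isinstance(ToyPreprocessor(), BackbonePreprocessor),
    "incomplete": isinstance(Incomplete(), BackbonePreprocessor),
    "wrong_signature": isinstance(WrongSignature(), BackbonePreprocessor),
}
```

Because `WrongSignature` passes, a conformance test would likely want to call the methods and assert on the results in addition to the `isinstance` check.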
@@ -0,0 +1,82 @@
"""C2 VPR strategy config block (AZ-336).
Registered into ``config.components['c2_vpr']`` by the package
``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.vpr_factory.build_vpr_strategy`
reads this block to select the strategy and locate the backbone
weights + FAISS index sidecar.
``backbone_weights_path`` and ``faiss_index_path`` are
deployment-specific; the dataclass supplies default locations that a
deployment can override. Both are typed :class:`pathlib.Path`, so
strings emitted by YAML loaders are coerced in ``__post_init__``, which
also validates that both are non-empty.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from pathlib import Path
from typing import Final
from gps_denied_onboard.config.schema import ConfigError
__all__ = [
"C2VprConfig",
"KNOWN_STRATEGIES",
]
KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset(
{
"ultra_vpr",
"net_vlad",
"mega_loc",
"mix_vpr",
"sela_vpr",
"eigen_places",
"salad",
}
)
@dataclass(frozen=True)
class C2VprConfig:
    """Per-component config for C2 VPR.

    ``strategy`` selects exactly one of the seven backbones
    (see :data:`KNOWN_STRATEGIES`); the composition-root factory
    respects compile-time ``BUILD_VPR_<variant>`` gating on top of
    this label.
    ``backbone_weights_path`` is the on-disk location of the
    backbone weights (TRT engine, ONNX model, PyTorch state dict —
    per strategy). ``faiss_index_path`` is the location of the
    pre-built FAISS HNSW index file (C6 ``DescriptorIndex`` reads
    its sidecar there).
    """

    strategy: str = "net_vlad"
    backbone_weights_path: Path = field(default_factory=lambda: Path("/models/vpr/weights"))
    faiss_index_path: Path = field(default_factory=lambda: Path("/cache/vpr/index.faiss"))

    def __post_init__(self) -> None:
        if self.strategy not in KNOWN_STRATEGIES:
            raise ConfigError(
                f"C2VprConfig.strategy={self.strategy!r} not in "
                f"{sorted(KNOWN_STRATEGIES)}"
            )
        for name in ("backbone_weights_path", "faiss_index_path"):
            raw = getattr(self, name)
            # Test emptiness on the raw value: Path("") normalises to ".",
            # so str(Path("")) is never empty once coerced.
            if not str(raw):
                raise ConfigError(f"C2VprConfig.{name} must be non-empty")
            if not isinstance(raw, Path):
                # Frozen dataclass: coerce str -> Path via object.__setattr__.
                object.__setattr__(self, name, Path(raw))
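One pathlib detail is relevant to the non-empty check: `Path("")` normalises to the current directory, so its string form is `"."` rather than the empty string. An emptiness test is therefore only meaningful on the raw value, before coercion. A minimal sketch, where `coerce_non_empty_path` is a hypothetical helper and `ValueError` stands in for `ConfigError`:

```python
from pathlib import Path


def coerce_non_empty_path(name: str, raw) -> Path:
    """Reject empty input first, then coerce to Path (illustrative only)."""
    if not str(raw):
        raise ValueError(f"{name} must be non-empty")
    return Path(raw)


# The empty string survives coercion as ".", which is truthy as a str.
empty_as_path = str(Path(""))
weights = coerce_non_empty_path("backbone_weights_path", "/models/vpr/weights")
```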
@@ -0,0 +1,66 @@
"""C2 VprStrategy error taxonomy (AZ-336).
Every ``VprStrategy`` method raises only members of :class:`VprError`.
Lower-level exceptions from the backbone runtime (TRT deserialize,
CUDA OOM, ONNX runtime IO mismatch, FAISS index torn mmap) MUST be
caught and rewrapped by the concrete strategy — the contract closes
the error envelope so consumers can ``except VprError`` once and
handle the family.
A separate composition-time error
(:class:`gps_denied_onboard.runtime_root.errors.StrategyNotAvailableError`)
lives outside this family — it is raised by the factory, not by a
``VprStrategy`` method.
Note: C6 ``c6_tile_cache.errors`` also defines an
``IndexUnavailableError`` for the underlying ``DescriptorIndex``
search path. The two classes are intentionally distinct (same name,
different namespaces): the C2 family is the closed envelope a C5/C2.5
consumer sees; the C6 family is the storage-layer error a concrete
strategy is responsible for rewrapping.
"""
from __future__ import annotations
__all__ = [
"IndexUnavailableError",
"VprBackboneError",
"VprError",
"VprPreprocessError",
]
class VprError(Exception):
    """Base class for the C2 VPR error family.

    Caught at the runtime root; downstream effect per AC-1.4:
    C5 falls back to VIO-only with provenance ``visual_propagated``.
    """


class VprBackboneError(VprError):
    """Backbone forward pass failed.

    CUDA OOM, TRT engine deserialize mismatch, ONNX runtime IO
    shape mismatch. Logged at ERROR; one FDR record per occurrence.
    """


class VprPreprocessError(VprError):
    """Input frame violates the backbone's preprocessing contract.

    Wrong colour channels, calibration mismatch. Logged at ERROR;
    one FDR record per occurrence. The concrete preprocessor
    (each strategy owns its own per description.md § 6) raises this
    and the strategy lets it propagate unchanged.
    """


class IndexUnavailableError(VprError):
    """FAISS index handle invalid for the strategy's retrieve path.

    Post-F8 reboot before warm-up, out-of-band file replacement
    caught by the underlying mmap defence, dim mismatch caught at
    search time. The strategy MUST raise this rather than return
    stale candidates (C2-ST-01).
    """
@@ -1,17 +1,113 @@
"""C2 `VprStrategy` Protocol.
"""C2 ``VprStrategy`` Protocol (AZ-336).
Concrete strategies: UltraVPR (primary), MegaLoc, MixVPR, SelaVPR, EigenPlaces,
NetVLAD, SALAD. See `_docs/02_document/components/02_c2_vpr/`.
PEP 544 ``typing.Protocol`` decorated with ``@runtime_checkable``; three
methods spanning the camera-ingest hot path
(:meth:`embed_query` + :meth:`retrieve_topk`) and the composition-time
pre-flight check (:meth:`descriptor_dim`).
Concrete impls — :class:`UltraVprStrategy` (AZ-337),
:class:`NetVladStrategy` (AZ-338), :class:`MegaLocStrategy` /
:class:`MixVprStrategy` (AZ-339), :class:`SelaVprStrategy` /
:class:`EigenPlacesStrategy` / :class:`SaladStrategy` (AZ-340) — live
in sibling modules and are imported lazily by
:mod:`gps_denied_onboard.runtime_root.vpr_factory`.
The contract at
``_docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md`` v1.0.0
is the authoritative shape; this module mirrors it 1:1.
"""
from __future__ import annotations
from typing import Protocol
from typing import TYPE_CHECKING, Protocol, runtime_checkable
from gps_denied_onboard._types.vpr import VprQuery, VprResult
if TYPE_CHECKING:
    from gps_denied_onboard._types.calibration import CameraCalibration
    from gps_denied_onboard._types.nav import NavCameraFrame
    from gps_denied_onboard._types.vpr import VprQuery, VprResult
__all__ = ["VprStrategy"]
@runtime_checkable
class VprStrategy(Protocol):
"""Visual Place Recognition strategy: encode → retrieve top-K candidates."""
"""Single-camera visual place recognition strategy.
def retrieve(self, query: VprQuery, top_k: int = 10) -> VprResult: ...
Stateless per-frame; the only persistent state is the loaded
backbone weights and the C6-owned FAISS index handle (passed in
via constructor by the strategy's ``create(...)`` factory).
Invariants (see ``vpr_strategy_protocol.md`` v1.0.0):
- **INV-1 single-threaded** — each instance is bound to one
ingest thread; the composition root enforces this. Concurrent
:meth:`embed_query` calls on a single instance race on the GPU
stream.
- **INV-2 stateless per-frame** — no implicit dependency on
prior frames; reordering :meth:`embed_query` calls yields
identical embeddings.
- **INV-3 L2-normalised** — :attr:`VprQuery.embedding` is
L2-normalised before return (cosine ≡ Euclidean on the
FAISS HNSW lookup).
- **INV-4 top-K size + order** — :meth:`retrieve_topk` returns
exactly ``k`` candidates, ascending by
:attr:`VprCandidate.descriptor_distance`.
- **INV-5 backbone_label non-empty** — every
:attr:`VprResult.backbone_label` matches the strategy's
``BUILD_VPR_<variant>`` lowercase form.
- **INV-6 deterministic** — same frame + calibration + corpus
→ identical embedding + identical top-K (bit-exact for
float32; ULP-tolerant for float16).
- **INV-7 descriptor_dim stable** — :meth:`descriptor_dim`
never changes after construction; reflects the loaded
weights' output dim, NOT a config knob.
Error envelope: only members of
:class:`gps_denied_onboard.components.c2_vpr.errors.VprError`
escape the three methods. Lower-level exceptions (CUDA, TRT,
FAISS) MUST be rewrapped by the concrete strategy.
"""
def embed_query(
self,
frame: "NavCameraFrame",
calibration: "CameraCalibration",
) -> "VprQuery":
"""Run the backbone forward pass; return a ``VprQuery``.
Calibration is consumed by the strategy's internal
:class:`BackbonePreprocessor` for resize / crop / normalise.
Raises :class:`VprBackboneError` on backbone failure
(CUDA OOM, TRT deserialize mismatch, etc.) and
:class:`VprPreprocessError` on preprocessor contract
violation.
"""
...
def retrieve_topk(self, query: "VprQuery", k: int) -> "VprResult":
"""Run the FAISS HNSW top-K lookup against the corpus index.
The strategy holds the FAISS index handle
(constructor-injected from C6's ``DescriptorIndex``).
Top-K candidates are returned ascending by
:attr:`VprCandidate.descriptor_distance`.
Raises :class:`IndexUnavailableError` when the FAISS index
handle is invalid (post-F8 reboot before warm-up;
out-of-band file replacement caught by mmap defence;
fewer than ``k`` indexed vectors).
"""
...
def descriptor_dim(self) -> int:
"""Backbone embedding dimensionality.
Examples: 512 for UltraVPR; 4096 for NetVLAD-VGG16.
Stable for the strategy's lifetime. Consumed by the
composition root at startup to pre-validate index
compatibility against the C6 ``DescriptorIndex`` sidecar
(mismatch → :class:`ConfigError` at startup, NOT at first
frame).
"""
...
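INV-3 and INV-4 can be illustrated with a brute-force stand-in for the HNSW lookup. This is a sketch under stated assumptions, not the strategy implementation: `l2_normalise`, the dict-based `corpus` and the `(distance, tile_id)` return shape are hypothetical, and a linear scan replaces FAISS. For unit vectors the squared Euclidean distance equals `2 - 2 * cos`, which is why ranking by either metric gives the same order.

```python
import math


class IndexUnavailableError(Exception):
    """Stand-in for the C2 error of the same name."""


def l2_normalise(vec):
    """Scale a vector to unit L2 norm (INV-3)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return tuple(x / norm for x in vec)


def retrieve_topk(query, corpus, k):
    """Return exactly k (distance, tile_id) pairs, ascending (INV-4)."""
    if len(corpus) < k:
        raise IndexUnavailableError(f"index holds {len(corpus)} < k={k} vectors")
    scored = sorted(
        (math.dist(query, vec), tile_id) for tile_id, vec in corpus.items()
    )
    return tuple(scored[:k])


# Toy corpus keyed by plain (zoom, lat, lon) tuples.
corpus = {
    (18, 0.0, 0.0): l2_normalise((1.0, 0.0)),
    (18, 0.0, 1.0): l2_normalise((0.0, 1.0)),
    (18, 1.0, 1.0): l2_normalise((1.0, 1.0)),
}
query = l2_normalise((3.0, 4.0))
hits = retrieve_topk(query, corpus, 2)
```

Raising when the corpus holds fewer than `k` vectors, rather than returning a short result, mirrors the docstring above: the caller can rely on the result length without checking it.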