[AZ-336] C2 VprStrategy: Protocol + DTOs + factory + composition

Foundational scaffolding for every concrete C2 backbone (UltraVPR,
NetVLAD, MegaLoc, MixVPR, SelaVPR, EigenPlaces, SALAD — AZ-337..AZ-340)
and the C2.5 ReRanker consumer side. No backbone is implemented here.

* VprStrategy Protocol (embed_query / retrieve_topk / descriptor_dim)
  + BackbonePreprocessor C2-internal Protocol (NOT in Public API per
  description.md § 6).
* DTOs in L1 _types/vpr.py — VprQuery, VprCandidate, VprResult; all
  frozen + slots; tuple-not-list for VprResult.candidates so the
  immutability invariant truly holds.
* Error family: VprError + VprBackboneError + VprPreprocessError +
  IndexUnavailableError; same-named but namespace-distinct from
  c6_tile_cache.IndexUnavailableError (the c2 family is the closed
  envelope C5 / C2.5 consume; concrete strategies rewrap the C6 form).
* C2VprConfig (strategy enum + backbone_weights_path + faiss_index_path)
  with strict validation at load; registered into Config.components on
  c2_vpr import.
* build_vpr_strategy factory with 7-strategy resolution table, lazy
  import, BUILD_VPR_<variant> gating, ImportError→
  StrategyNotAvailableError mapping, and pre-flight descriptor_dim
  match against DescriptorIndex.descriptor_dim() — mismatch fires
  ConfigError at startup, NOT at first frame.

Contract change vs the v1.0.0 draft: factory takes descriptor_index:
DescriptorIndex (not tile_store: TileStore) because descriptor_dim()
lives on DescriptorIndex per C6's Public API. The contract markdown
is updated to match.

Architecture: VprCandidate.tile_id is a plain (zoom, lat, lon) tuple,
keeping _types/ (L1) free of any c6_tile_cache (L3) import per
module-layout.md. Consumers reconstruct TileId at the C6 boundary.

Excluded per task spec:
* Concrete backbones (AZ-337..AZ-340).
* FAISS HNSW retrieve wiring (AZ-341).
* DescriptorNormaliser helper (AZ-283, already shipped).
* AC-9 single-thread binding — deferred per task spec Risk 4 until the
  generic compose_root thread-binding registry is in place (today
  each factory owns its own, e.g. fc_factory).

Tests: 45 ACs + NFRs in tests/unit/c2_vpr/test_protocol_conformance.py
covering AC-1..AC-8, the error family, the config validation, the
factory NFR (p99 ≤ 50 ms). The legacy smoke test is removed. Full
sweep 973 passed, 2 skipped (CI-only cmake / actionlint).

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-12 05:25:35 +03:00
parent 823c0f1b2e
commit 3665acef66
12 changed files with 1202 additions and 56 deletions
@@ -4,7 +4,7 @@
**Producer task**: AZ-336 (`VprStrategy` Protocol + factory + composition)
**Consumer tasks**: AZ-337 (UltraVPR), AZ-338 (NetVLAD baseline), AZ-339 (MegaLoc + MixVPR), AZ-340 (SelaVPR + EigenPlaces + SALAD), AZ-341 (FAISS HNSW retrieve wiring), and downstream c2_5_rerank (AZ-256 / E-C2.5)
**Module-layout home**: `src/gps_denied_onboard/components/c2_vpr/interface.py` (Protocols), `src/gps_denied_onboard/components/c2_vpr/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/vpr_factory.py` (factory)
**Status**: draft, awaiting AZ-336 implementation
**Status**: v1.0.0 (AZ-336 implemented 2026-05-12)
## Purpose
@@ -69,25 +69,23 @@ class VprStrategy(Protocol):
```python
from dataclasses import dataclass
from uuid import UUID
import numpy as np
@dataclass(frozen=True, slots=True)
class VprQuery:
"""Backbone embedding for a single nav-camera frame. Produced by `VprStrategy.embed_query`; consumed by `VprStrategy.retrieve_topk` (same instance) or — in the C10 corpus-build path — by `DescriptorIndexBuilder` to populate the corpus descriptor matrix."""
frame_id: UUID
embedding: np.ndarray # shape (D,), dtype float16 or float32; L2-normalised
produced_at: int # monotonic_ns
frame_id: int # echoes NavCameraFrame.frame_id (the source carries int across the pipeline)
embedding: object # numpy.ndarray, shape (D,), dtype float16|float32; L2-normalised. typed object to keep _types/ free of numpy import-time dep.
produced_at: int # monotonic_ns from injected Clock
@dataclass(frozen=True, slots=True)
class VprCandidate:
"""One retrieval candidate from the top-K result."""
tile_id: tuple # composite (zoomLevel, lat, lon); see C6 TileRecord
descriptor_distance: float # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
tile_id: tuple[int, float, float] # composite (zoom_level, lat, lon); matches c6_tile_cache.TileId. tuple form keeps _types/ free of an L1→L3 import per module-layout.md.
descriptor_distance: float # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
descriptor_dim: int
@@ -95,10 +93,10 @@ class VprCandidate:
class VprResult:
"""Top-K candidates from `VprStrategy.retrieve_topk`. Consumed by C2.5 ReRanker."""
frame_id: UUID
candidates: list[VprCandidate] # length == k, sorted ascending by descriptor_distance
retrieved_at: int # monotonic_ns
backbone_label: str # non-empty; matches BUILD_VPR_<variant> lowercase
frame_id: int # echoes the source NavCameraFrame.frame_id
candidates: tuple[VprCandidate, ...] # length == k, ascending by descriptor_distance. tuple (not list) so the frozen+slots invariant holds.
retrieved_at: int # monotonic_ns from injected Clock
backbone_label: str # non-empty; matches BUILD_VPR_<variant> lowercase
```
### Protocol: `BackbonePreprocessor` (C2-internal; lives in `c2_vpr/_preprocessor.py`)
@@ -155,18 +153,21 @@ class IndexUnavailableError(VprError):
# src/gps_denied_onboard/runtime_root/vpr_factory.py
from typing import TYPE_CHECKING
from gps_denied_onboard.config import Config
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.components.c2_vpr import VprStrategy
from gps_denied_onboard.components.c6_tile_cache import TileStore
from gps_denied_onboard.components.c6_tile_cache import DescriptorIndex
from gps_denied_onboard.components.c7_inference import InferenceRuntime
def build_vpr_strategy(
config: Config,
tile_store: TileStore,
*,
descriptor_index: DescriptorIndex,
inference_runtime: InferenceRuntime,
) -> VprStrategy:
"""Composition-root factory. Reads `config.vpr.strategy` and `config.vpr.backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `ConfigurationError` pointing at the offending strategy name + missing flag).
"""Composition-root factory. Reads `config.components['c2_vpr'].strategy` and `config.components['c2_vpr'].backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `StrategyNotAvailableError` pointing at the offending strategy name + missing flag).
`descriptor_index` (NOT `tile_store`) is injected: the pre-flight `descriptor_dim` validation reads from the C6 `DescriptorIndex.descriptor_dim()` which is the Public API that owns the FAISS index sidecar. The contract draft earlier named this parameter `tile_store`; the implementation moved it to match C6's actual Public API.
Strategy resolution table:
@@ -180,9 +181,9 @@ def build_vpr_strategy(
| "eigen_places" | EigenPlacesStrategy | components.c2_vpr.eigen_places | BUILD_VPR_EIGENPLACES |
| "salad" | SaladStrategy | components.c2_vpr.salad | BUILD_VPR_SALAD |
Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches the C6 corpus index's declared `descriptor_dim` (read from the FAISS index sidecar). Mismatch → `ConfigurationError` at startup, NOT at first frame.
Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches `descriptor_index.descriptor_dim()` (the FAISS index sidecar value). Mismatch → `ConfigError` at startup, NOT at first frame.
Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread.
Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread (AC-9 deferred until the generic compose_root thread-binding registry is in place; see task spec Risk 4).
"""
...
```