mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 09:21:12 +00:00
[AZ-338] [AZ-283] C2 NetVLAD mandatory simple-baseline VprStrategy

NetVLAD is the C2 comparative baseline per the engine rule (every
production-default backbone ships with a simple baseline alongside). It
runs on the C7 PyTorch FP16 runtime (NOT TRT) so that a TRT engine compile
bug cannot break NetVLAD and UltraVPR at the same time.

Production changes:
- c2_vpr/net_vlad.py: NetVladStrategy + module-level create() factory.
  Constructor wires InferenceRuntimeCut + DescriptorIndexCut +
  NetVladBackbonePreprocessor + DescriptorNormaliser + FaissBridge.
  embed_query pipeline: preprocess -> runtime.infer -> dual-stage
  normalisation (intra-cluster THEN global L2) -> VprQuery. retrieve_topk
  is a one-line delegation to FaissBridge.
- c2_vpr/_net_vlad_architecture.py: Arandjelovic et al. 2016 NetVLAD layer
  over torchvision VGG16 features + optional Linear PCA projection to
  descriptor_dim (default 4096; published Pittsburgh reference uses
  K*D=64*512=32768 raw + Linear(32768, 4096) PCA).
- c2_vpr/_preprocessor_net_vlad.py: OpenCV-based image preprocessor:
  decode -> centre-crop square -> resize (480, 480) -> ImageNet
  normalisation -> FP16 NCHW. Calibration is not consumed (NetVLAD is
  calibration-agnostic per the published preprocessing chain).
- c2_vpr/inference_runtime_cut.py: NEW AZ-507 consumer-side cut mirroring
  C7 InferenceRuntime; lets c2_vpr stay AZ-507-clean.
- c2_vpr/config.py: added netvlad_descriptor_dim: int = 4096 knob.
- helpers/descriptor_normaliser.py: added intra_cluster_normalise
  (DescriptorNormaliser v1.0.0 -> v1.1.0; backward-compatible add).
- runtime_root/vpr_factory.py: added _register_strategy_architecture
  helper that binds (MODEL_NAME, architecture_factory(descriptor_dim)) to
  C7's architecture registry before delegating to the strategy's create()
  factory. Keeps the c7 import at L4, preserves AZ-507.
- fdr_client/records.py: registered vpr.embed_query, vpr.backbone_error,
  vpr.preprocess_error record kinds.

Tests:
- tests/unit/c2_vpr/test_net_vlad.py: 31 tests covering all 11 ACs +
  preprocessor contract + architecture factory + constructor validation +
  FDR record emission.
- tests/unit/test_az283_descriptor_normaliser.py: +8 tests for the new
  intra_cluster_normalise.
- tests/unit/test_az272_fdr_record_schema.py: +3 fixture payloads.

Full unit suite: 1608 passed / 80 env-skipped (+43 new tests).

Per-batch code review (batch_46_review.md): PASS_WITH_WARNINGS (4
Low-severity hygiene findings; no Critical/High/Medium).

Architectural notes:
- The spec implied c2_vpr.net_vlad.create() registers the architecture
  with C7. That violates AZ-507 (no cross-component imports). Resolved by
  exposing MODEL_NAME + architecture_factory(descriptor_dim) on the
  strategy module and having the composition root perform the C7 bind.
- C7 PyTorch runtime API names in the spec (forward, load_engine) were
  outdated; aligned the implementation with the live v1.0.0 Protocol
  (infer, compile_engine + deserialize_engine). Spec hygiene flagged in
  review F2.

Co-authored-by: Cursor <cursoragent@cursor.com>
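The dual-stage normalisation named in the embed_query pipeline (intra-cluster THEN global L2) can be sketched in plain Python. This is a minimal illustration, assuming the descriptor is laid out as `num_clusters` equal contiguous blocks; the two functions below are stand-ins written for this note and are not the `DescriptorNormaliser` implementation.

```python
import math


def l2_normalise(vec, eps=1e-12):
    """Scale vec to unit L2 norm; a (near-)zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm < eps:
        return list(vec)
    return [v / norm for v in vec]


def intra_cluster_normalise(vec, num_clusters):
    """L2-normalise each of the num_clusters equal contiguous blocks of vec."""
    if len(vec) % num_clusters != 0:
        raise ValueError("descriptor length must be divisible by num_clusters")
    block = len(vec) // num_clusters
    out = []
    for k in range(num_clusters):
        out.extend(l2_normalise(vec[k * block:(k + 1) * block]))
    return out


# Toy descriptor: 2 clusters x 2 dims (illustrative values only).
raw = [3.0, 4.0, 0.0, 5.0]
intra = intra_cluster_normalise(raw, num_clusters=2)  # stage 1: per cluster
final = l2_normalise(intra)                           # stage 2: whole vector
```

After stage 1 every cluster block has unit norm, so no single cluster dominates the descriptor; stage 2 then brings the whole vector to unit norm.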
@@ -37,6 +37,9 @@ from gps_denied_onboard.components.c2_vpr.errors import (
     VprError,
     VprPreprocessError,
 )
+from gps_denied_onboard.components.c2_vpr.inference_runtime_cut import (
+    InferenceRuntimeCut,
+)
 from gps_denied_onboard.components.c2_vpr.interface import VprStrategy
 from gps_denied_onboard.config.schema import register_component_block

@@ -46,6 +49,7 @@ __all__ = [
     "C2VprConfig",
     "DescriptorIndexCut",
     "IndexUnavailableError",
+    "InferenceRuntimeCut",
     "TileIdTuple",
     "VprBackboneError",
     "VprCandidate",
@@ -0,0 +1,144 @@
"""NetVLAD VGG16 architecture (AZ-338).

Reference: Arandjelović, R., Gronat, P., Torii, A., Pajdla, T., & Sivic, J.
"NetVLAD: CNN architecture for weakly supervised place recognition", CVPR
2016. The architecture is the canonical VPR baseline.

The architecture has three parts:

1. **VGG16 trunk**: ``torchvision.models.vgg16`` feature extractor up to
   the ``conv5_3`` layer (last conv before the classifier). Output is a
   ``(B, encoder_dim=512, H', W')`` feature map.
2. **NetVLAD pooling layer**: the differentiable VLAD aggregation, i.e.
   soft cluster assignment (1x1 conv + softmax over K)
   times residuals against K learned cluster centres, summed per cluster
   to produce a ``(B, K, D)`` aggregated descriptor, then flattened to
   ``(B, K*D)``.
3. **PCA projection (optional)**: a learned ``nn.Linear(K*D, descriptor_dim)``
   that whitens / reduces the raw VLAD descriptor to the deployment-pinned
   output dim. Per the published Pittsburgh NetVLAD code drop, the default
   pinned dim is ``4096`` (whitened from ``K*D = 64*512 = 32768``). When
   ``descriptor_dim == K*D`` the PCA layer is omitted and the raw VLAD is
   returned unchanged.

The module is registered into the C7 ``architecture_registry`` (AZ-300)
under the ``"net_vlad"`` key by the composition root, not by :mod:`net_vlad`
itself (AZ-507). Registration time is composition time: the root takes the
closure from ``architecture_factory(descriptor_dim)`` and registers it with
the config-driven ``descriptor_dim``. The C7 registry stays torch-free;
torch / torchvision are imported lazily inside the factory.
"""

from __future__ import annotations

from typing import TYPE_CHECKING, Final

if TYPE_CHECKING:
    from torch import nn

__all__ = [
    "DEFAULT_DESCRIPTOR_DIM",
    "DEFAULT_ENCODER_DIM",
    "DEFAULT_NUM_CLUSTERS",
    "make_net_vlad_vgg16",
]

DEFAULT_ENCODER_DIM: Final[int] = 512
DEFAULT_NUM_CLUSTERS: Final[int] = 64
DEFAULT_DESCRIPTOR_DIM: Final[int] = 4096


def make_net_vlad_vgg16(
    *,
    num_clusters: int = DEFAULT_NUM_CLUSTERS,
    encoder_dim: int = DEFAULT_ENCODER_DIM,
    descriptor_dim: int = DEFAULT_DESCRIPTOR_DIM,
) -> nn.Module:
    """Construct a fresh, randomly-initialised NetVLAD-VGG16 module.

    ``descriptor_dim == num_clusters * encoder_dim`` skips the PCA
    projection; any other value adds an ``nn.Linear(K*D, descriptor_dim)``
    final layer (the published NetVLAD reference's "WPCA + L2" tail).

    Torch / torchvision are imported here, not at module load, keeping
    the c2_vpr package free of torch on Tier-0 builds. Callers seeking
    deterministic weights MUST call ``torch.manual_seed`` before
    invocation.
    """
    if num_clusters < 1 or encoder_dim < 1 or descriptor_dim < 1:
        raise ValueError(
            f"make_net_vlad_vgg16: dimensions must be positive; "
            f"got num_clusters={num_clusters}, encoder_dim={encoder_dim}, "
            f"descriptor_dim={descriptor_dim}"
        )

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torchvision

    class _NetVladLayer(nn.Module):
        """Differentiable VLAD aggregation (Arandjelović et al. 2016).

        Soft assignment: ``soft_assign = softmax(conv1x1(x))`` over K
        clusters. Residuals: ``r_ijk = x_ij - c_k``. Output:
        ``v_k = sum_ij(soft_assign[k,i,j] * r_ijk)``. Flattened to a
        single 1-D vector per batch element.
        """

        def __init__(self, num_clusters: int, encoder_dim: int) -> None:
            super().__init__()
            self.num_clusters = num_clusters
            self.encoder_dim = encoder_dim
            self.conv = nn.Conv2d(
                encoder_dim, num_clusters, kernel_size=(1, 1), bias=True
            )
            self.centroids = nn.Parameter(
                torch.randn(num_clusters, encoder_dim) * 0.01
            )

        def forward(self, features: torch.Tensor) -> torch.Tensor:
            n_batch, n_channels = features.shape[0], features.shape[1]
            soft_assign = self.conv(features).view(n_batch, self.num_clusters, -1)
            soft_assign = F.softmax(soft_assign, dim=1)  # (B, K, N), softmax over K
            flat = features.view(n_batch, n_channels, -1)  # (B, D, N)
            soft_assign_t = soft_assign.transpose(1, 2)  # (B, N, K)
            assigned = torch.bmm(flat, soft_assign_t)  # (B, D, K): sum_n a_k(n) * x_n
            cluster_sum = soft_assign.sum(dim=2)  # (B, K): sum_n a_k(n)
            scaled_centroids = self.centroids.unsqueeze(0) * cluster_sum.unsqueeze(2)
            vlad = assigned.transpose(1, 2) - scaled_centroids  # (B, K, D)
            return vlad.reshape(n_batch, -1)  # (B, K*D)

    class _NetVladVgg16(nn.Module):
        def __init__(
            self,
            num_clusters: int,
            encoder_dim: int,
            descriptor_dim: int,
        ) -> None:
            super().__init__()
            self._raw_dim = num_clusters * encoder_dim
            vgg = torchvision.models.vgg16(weights=None)
            self.encoder = nn.Sequential(*list(vgg.features.children())[:-2])  # drop final ReLU + MaxPool
            self.pool = _NetVladLayer(num_clusters, encoder_dim)
            if descriptor_dim == self._raw_dim:
                self.pca: nn.Module | None = None
            else:
                self.pca = nn.Linear(self._raw_dim, descriptor_dim, bias=True)

        def forward(
            self, input: torch.Tensor
        ) -> dict[str, torch.Tensor]:
            features = self.encoder(input)
            vlad_raw = self.pool(features)
            if self.pca is not None:
                vlad_descriptor = self.pca(vlad_raw)
            else:
                vlad_descriptor = vlad_raw
            return {"vlad_descriptor": vlad_descriptor}

    return _NetVladVgg16(
        num_clusters=num_clusters,
        encoder_dim=encoder_dim,
        descriptor_dim=descriptor_dim,
    )
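The pooling layer's `forward` computes the VLAD sum in a factored form (one batched matrix product, then a subtraction of centroids scaled by the total assignment weight) rather than materialising every residual. A minimal pure-Python sketch of why the two forms agree follows; the `logits` stand in for the 1x1 convolution output, and all names and toy values here are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def vlad_explicit(features, logits, centroids):
    """v_k = sum_n a_k(n) * (x_n - c_k), written out term by term."""
    K, D = len(centroids), len(centroids[0])
    vlad = [[0.0] * D for _ in range(K)]
    for x, z in zip(features, logits):
        a = softmax(z)  # soft assignment of this location over the K clusters
        for k in range(K):
            for d in range(D):
                vlad[k][d] += a[k] * (x[d] - centroids[k][d])
    return vlad


def vlad_factored(features, logits, centroids):
    """Same quantity as sum_n a_k(n) * x_n  minus  c_k * sum_n a_k(n)."""
    K, D = len(centroids), len(centroids[0])
    assigns = [softmax(z) for z in logits]
    vlad = []
    for k in range(K):
        weight = sum(a[k] for a in assigns)  # total assignment mass of cluster k
        row = []
        for d in range(D):
            weighted_sum = sum(a[k] * x[d] for a, x in zip(assigns, features))
            row.append(weighted_sum - centroids[k][d] * weight)
        vlad.append(row)
    return vlad


# Three spatial locations, D=2 channels, K=2 clusters (toy values).
features = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
logits = [[2.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
centroids = [[0.5, 0.5], [0.0, 1.0]]

explicit = vlad_explicit(features, logits, centroids)
factored = vlad_factored(features, logits, centroids)
flat = [v for row in factored for v in row]  # flattened (K*D,) descriptor
```

The factored form is the one that maps onto `torch.bmm`, which avoids building a `(B, K, D, N)` residual tensor.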
@@ -0,0 +1,137 @@
"""NetVLAD-VGG16 backbone preprocessor (AZ-338).

Per AZ-338 § Outcome: NetVLAD's published preprocessing chain decodes the
nav-camera frame's image to RGB uint8, centre-crops it to a square region
(camera calibration is not consumed), resizes to ``(480, 480)``, applies
ImageNet mean/std normalisation, casts to FP16, and reshapes to NCHW.

This preprocessor is C2-internal and owned exclusively by
:class:`NetVladStrategy`; UltraVPR and the other backbones each ship
their own concrete preprocessor (description.md § 6 forbids sharing).

This class conforms structurally to the :class:`BackbonePreprocessor`
Protocol, which lives in :mod:`c2_vpr._preprocessor`; the strategy
module imports the concrete preprocessor and constructs it in the
``create(...)`` factory.
"""

from __future__ import annotations

from typing import TYPE_CHECKING, Final

import cv2
import numpy as np

from gps_denied_onboard.components.c2_vpr.errors import VprPreprocessError

if TYPE_CHECKING:
    from gps_denied_onboard._types.calibration import CameraCalibration
    from gps_denied_onboard._types.nav import NavCameraFrame

__all__ = [
    "IMAGENET_MEAN",
    "IMAGENET_STD",
    "NETVLAD_INPUT_HW",
    "NetVladBackbonePreprocessor",
]

NETVLAD_INPUT_HW: Final[tuple[int, int]] = (480, 480)
IMAGENET_MEAN: Final[tuple[float, float, float]] = (0.485, 0.456, 0.406)
IMAGENET_STD: Final[tuple[float, float, float]] = (0.229, 0.224, 0.225)


class NetVladBackbonePreprocessor:
    """Centre-crop + resize + ImageNet-normalise + FP16-NCHW for NetVLAD-VGG16."""

    def __init__(
        self,
        *,
        input_shape: tuple[int, int] = NETVLAD_INPUT_HW,
        mean: tuple[float, float, float] = IMAGENET_MEAN,
        std: tuple[float, float, float] = IMAGENET_STD,
    ) -> None:
        if (
            not isinstance(input_shape, tuple)
            or len(input_shape) != 2
            or any(not isinstance(v, int) or v <= 0 for v in input_shape)
        ):
            raise ValueError(
                f"NetVladBackbonePreprocessor.input_shape must be a (H, W) "
                f"tuple of positive ints; got {input_shape!r}"
            )
        if len(mean) != 3 or len(std) != 3:
            raise ValueError(
                "NetVladBackbonePreprocessor.mean and std must each be "
                "3-tuples (one per channel)"
            )
        if any(v <= 0 for v in std):
            raise ValueError(
                "NetVladBackbonePreprocessor.std components must be > 0"
            )
        self._input_shape: tuple[int, int] = input_shape
        self._mean: np.ndarray = np.array(mean, dtype=np.float32).reshape(1, 1, 3)
        self._std: np.ndarray = np.array(std, dtype=np.float32).reshape(1, 1, 3)

    def preprocess(
        self,
        frame: NavCameraFrame,
        calibration: CameraCalibration,
    ) -> np.ndarray:
        """Decode → centre-crop → resize → normalise → FP16 NCHW.

        ``calibration`` is accepted for Protocol conformance but is not
        consumed here: NetVLAD's published preprocessing chain does not
        use principal-point or distortion correction (the backbone is
        trained on ImageNet-style centre-cropped frames; calibration
        differences are absorbed into the learned VLAD residuals).
        UltraVPR's preprocessor uses calibration; this one does not.

        Raises:
            :class:`VprPreprocessError` on shape / dtype / decode
            violations.
        """
        del calibration
        image = self._coerce_to_rgb_uint8(frame.image)
        cropped = self._centre_crop_square(image)
        try:
            # cv2.resize takes dsize as (W, H); input_shape is stored as (H, W).
            resized = cv2.resize(
                cropped, self._input_shape[::-1], interpolation=cv2.INTER_AREA
            )
        except cv2.error as exc:
            raise VprPreprocessError(
                f"cv2.resize failed: {type(exc).__name__}: {exc}"
            ) from exc
        as_f32 = resized.astype(np.float32) / 255.0
        normalised = (as_f32 - self._mean) / self._std
        chw = normalised.transpose(2, 0, 1)
        return np.ascontiguousarray(chw[None, :, :, :], dtype=np.float16)

    def input_shape(self) -> tuple[int, int]:
        return self._input_shape

    @staticmethod
    def _coerce_to_rgb_uint8(image: object) -> np.ndarray:
        if not isinstance(image, np.ndarray):
            raise VprPreprocessError(
                f"frame.image must be a numpy array; got {type(image).__name__}"
            )
        if image.dtype != np.uint8:
            raise VprPreprocessError(
                f"frame.image must be uint8 RGB; got dtype {image.dtype}"
            )
        if image.ndim == 2:
            # Grayscale → 3-channel by repeating
            return np.stack([image, image, image], axis=-1)
        if image.ndim == 3 and image.shape[2] == 3:
            return image
        raise VprPreprocessError(
            f"frame.image must be (H,W) or (H,W,3); got shape {image.shape}"
        )

    @staticmethod
    def _centre_crop_square(image: np.ndarray) -> np.ndarray:
        h, w = image.shape[:2]
        side = min(h, w)
        top = (h - side) // 2
        left = (w - side) // 2
        return image[top : top + side, left : left + side, :]
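The crop and normalisation arithmetic in the preprocessor can be checked by hand with a small stdlib-only sketch. The helper names and the 1080p frame size are hypothetical and chosen only for illustration; the mean/std constants are the ones defined in the file above.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def centre_crop_box(height, width):
    """Return (top, left, side) of the largest centred square."""
    side = min(height, width)
    return (height - side) // 2, (width - side) // 2, side


def normalise_pixel(rgb):
    """Map one uint8 RGB pixel to ImageNet-normalised floats."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# A hypothetical 1080p landscape frame: the crop keeps the full height
# and trims equal margins from the left and right.
top, left, side = centre_crop_box(1080, 1920)

# A pixel close to the ImageNet mean colour lands near zero after normalisation.
pixel = normalise_pixel((124, 116, 104))
```

For the 1080 x 1920 frame the box is `top=0, left=420, side=1080`, so 420 columns are discarded on each side before the resize to `(480, 480)`.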
@@ -21,8 +21,8 @@ from typing import Final
 from gps_denied_onboard.config.schema import ConfigError

 __all__ = [
-    "C2VprConfig",
     "KNOWN_STRATEGIES",
+    "C2VprConfig",
 ]

 KNOWN_STRATEGIES: Final[frozenset[str]] = frozenset(
@@ -69,6 +69,7 @@ class C2VprConfig:
     faiss_index_path: Path = field(default_factory=lambda: Path("/cache/vpr/index.faiss"))
     warn_top1_threshold: float = 0.30
     debug_per_frame_distances: bool = False
+    netvlad_descriptor_dim: int = 4096

     def __post_init__(self) -> None:
         if self.strategy not in KNOWN_STRATEGIES:
@@ -109,3 +110,15 @@ class C2VprConfig:
                 f"C2VprConfig.debug_per_frame_distances must be a bool; "
                 f"got {self.debug_per_frame_distances!r}"
             )
+        if not isinstance(self.netvlad_descriptor_dim, int) or isinstance(
+            self.netvlad_descriptor_dim, bool
+        ):
+            raise ConfigError(
+                f"C2VprConfig.netvlad_descriptor_dim must be a non-bool "
+                f"int; got {self.netvlad_descriptor_dim!r}"
+            )
+        if self.netvlad_descriptor_dim < 1:
+            raise ConfigError(
+                f"C2VprConfig.netvlad_descriptor_dim must be >= 1; "
+                f"got {self.netvlad_descriptor_dim}"
+            )
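The extra `isinstance(..., bool)` test in the new validation is needed because `bool` is a subclass of `int` in Python, so `True` would otherwise pass as a descriptor dimension of 1. A minimal sketch of the same check follows; `ConfigError` and `VprConfigSketch` here are stand-ins defined for this example, not the project's classes.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Stand-in for the project's ConfigError (hypothetical base class)."""


@dataclass(frozen=True)
class VprConfigSketch:
    netvlad_descriptor_dim: int = 4096

    def __post_init__(self) -> None:
        value = self.netvlad_descriptor_dim
        # bool is a subclass of int, so isinstance(True, int) is True;
        # the explicit bool check keeps True/False out of an int field.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ConfigError(f"must be a non-bool int; got {value!r}")
        if value < 1:
            raise ConfigError(f"must be >= 1; got {value}")


def is_rejected(value) -> bool:
    """Return True when constructing the config with this value fails."""
    try:
        VprConfigSketch(netvlad_descriptor_dim=value)
    except ConfigError:
        return True
    return False
```

With this sketch, `True`, `0` and `4096.0` are all rejected while `4096` is accepted.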
@@ -0,0 +1,60 @@
"""C2's structural cut of C7 ``InferenceRuntime`` (AZ-507).

Concrete C2 ``VprStrategy`` impls call into C7's inference runtime to
load engine handles and run forward passes. Per AZ-507, ``c2_vpr`` MUST
NOT import ``components.c7_inference`` directly; the consumer-side cut
declares the structural Protocol surface that c2 actually uses, and the
composition root binds the c7 runtime as the concrete implementation.

This Protocol mirrors the subset of
:class:`gps_denied_onboard.components.c7_inference.InferenceRuntime`
that the C2 strategies consume: ``compile_engine``,
``deserialize_engine``, ``infer``, ``release_engine``, and
``current_runtime_label``. The full Protocol (which adds
``thermal_state``) is wider; the cut narrows to what C2 needs so
``isinstance(runtime, InferenceRuntimeCut)`` can be enforced without
demanding the wider surface.

DTOs (``BuildConfig``, ``EngineHandle``, ``EngineCacheEntry``) live in
:mod:`gps_denied_onboard._types.inference` (L1) and are imported here
directly; they are L1 shared types, not cross-component imports.
"""

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Literal, Protocol, runtime_checkable

from gps_denied_onboard._types.inference import (
    BuildConfig,
    EngineCacheEntry,
    EngineHandle,
)

if TYPE_CHECKING:
    import numpy as np

__all__ = ["InferenceRuntimeCut"]


@runtime_checkable
class InferenceRuntimeCut(Protocol):
    """Subset of C7 ``InferenceRuntime`` consumed by C2 strategies."""

    def compile_engine(
        self, model_path: Path, build_config: BuildConfig
    ) -> EngineCacheEntry: ...

    def deserialize_engine(self, entry: EngineCacheEntry) -> EngineHandle: ...

    def infer(
        self,
        handle: EngineHandle,
        inputs: dict[str, np.ndarray],
    ) -> dict[str, np.ndarray]: ...

    def release_engine(self, handle: EngineHandle) -> None: ...

    def current_runtime_label(
        self,
    ) -> Literal["tensorrt", "onnx_trt_ep", "pytorch_fp16"]: ...
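How a consumer-side cut behaves under `isinstance` can be seen with a reduced, stdlib-only sketch. The two-method Protocol and the fake runtime classes below are hypothetical stand-ins, far smaller than `InferenceRuntimeCut`, and exist only to show the structural check.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class RuntimeCutSketch(Protocol):
    """Two-method stand-in for a consumer-side cut (names are illustrative)."""

    def infer(self, handle: object, inputs: dict) -> dict: ...

    def current_runtime_label(self) -> str: ...


class FakePytorchRuntime:
    """Never imports or subclasses the Protocol; it only matches its shape."""

    def infer(self, handle: object, inputs: dict) -> dict:
        return {"vlad_descriptor": inputs["input"]}

    def current_runtime_label(self) -> str:
        return "pytorch_fp16"

    def thermal_state(self) -> str:  # wider surface, ignored by the cut
        return "nominal"


class NotARuntime:
    """Missing current_runtime_label, so it does not satisfy the cut."""

    def infer(self, handle: object, inputs: dict) -> dict:
        return {}


runtime = FakePytorchRuntime()
accepted = isinstance(runtime, RuntimeCutSketch)
rejected = isinstance(NotARuntime(), RuntimeCutSketch)
```

Note that `runtime_checkable` only checks that the named methods exist; it does not compare signatures or return types, so a static type checker is still needed to catch signature drift between the cut and the real runtime.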
@@ -0,0 +1,521 @@
|
||||
"""``NetVladStrategy`` — C2 mandatory simple-baseline VprStrategy (AZ-338).
|
||||
|
||||
NetVLAD is the C2 comparative baseline mandated by the engine rule (every
|
||||
production-default backbone ships with a simpler baseline alongside, so
|
||||
a code-drop / weights / engine compile bug in the primary has a
|
||||
fallback at the strategy layer). Per ``components/02_c2_vpr/description.md``
|
||||
§ 1, NetVLAD is paired with UltraVPR's primary path; per § 5, NetVLAD
|
||||
runs on the C7 PyTorch FP16 runtime (NOT TensorRT) so a TRT engine
|
||||
issue cannot simultaneously break both.
|
||||
|
||||
The strategy delegates retrieval to :class:`FaissBridge` (AZ-341) and
|
||||
the c6 ``DescriptorIndex`` cut (AZ-507) — see
|
||||
:mod:`gps_denied_onboard.components.c2_vpr._faiss_bridge`. Embedding
|
||||
goes through the c7 :class:`InferenceRuntime` Protocol; the architecture
|
||||
is registered into c7's architecture registry by this module's
|
||||
``create(...)`` factory.
|
||||
|
||||
Architecture loading flow:
|
||||
|
||||
1. ``create(config, descriptor_index, inference_runtime)`` is called by
|
||||
the composition-root :func:`build_vpr_strategy`.
|
||||
2. Build-flag guard: confirms ``inference_runtime.current_runtime_label()
|
||||
== "pytorch_fp16"`` — fails fast with :class:`ConfigError` otherwise
|
||||
(AC-11: airborne binary has the PyTorch runtime excluded so this
|
||||
surfaces at composition time, not at first frame).
|
||||
3. The factory binds ``descriptor_dim`` from
|
||||
``config.c2_vpr.netvlad_descriptor_dim`` into a closure and registers
|
||||
that closure with c7's architecture registry under ``"net_vlad"``.
|
||||
The closure is the zero-arg ``ArchitectureFactory`` callable shape
|
||||
the registry expects.
|
||||
4. ``inference_runtime.compile_engine(weights_path, build_config)`` is
|
||||
called with ``BuildConfig(precision=FP16, ...)``; the PyTorch runtime
|
||||
(AZ-300) returns an :class:`EngineCacheEntry` whose
|
||||
``extras["model_name"]`` is the checkpoint's file stem. The factory
|
||||
forces ``model_name = "net_vlad"`` so the registered factory is
|
||||
selected at :meth:`deserialize_engine` time.
|
||||
5. ``inference_runtime.deserialize_engine(entry)`` returns an
|
||||
:class:`EngineHandle`. The factory then queries the architecture's
|
||||
output shape via a single dry-run inference on a zero-init input,
|
||||
compares against the configured ``descriptor_dim``, and raises
|
||||
:class:`ConfigError` on mismatch (AC-7) BEFORE the strategy is
|
||||
bound.
|
||||
6. ``NetVladStrategy`` is constructed with the resolved handle + the
|
||||
:class:`FaissBridge` + :class:`NetVladBackbonePreprocessor` +
|
||||
:class:`DescriptorNormaliser`.
|
||||
|
||||
Per-frame :meth:`embed_query` pipeline:
|
||||
|
||||
1. ``preprocessor.preprocess(frame, calibration)`` → ``(1, 3, 480, 480)``
|
||||
FP16 NCHW ndarray.
|
||||
2. ``inference_runtime.infer(handle, {"input": tensor})`` →
|
||||
``{"vlad_descriptor": (1, descriptor_dim) FP16 ndarray}``.
|
||||
3. ``normaliser.intra_cluster_normalise(intermediate, num_clusters=64)``
|
||||
→ per-cluster L2 (the NetVLAD-canonical first stage).
|
||||
4. ``normaliser.l2_normalise(intra)`` → global L2 (the second stage).
|
||||
5. Return :class:`VprQuery` with ``frame_id``, normalised embedding,
|
||||
produced_at monotonic ns.
|
||||
|
||||
Error envelope: every method raises only members of :class:`VprError`.
|
||||
``RuntimeError`` from the backbone forward → rewrapped to
|
||||
:class:`VprBackboneError`; :class:`VprPreprocessError` from the
|
||||
preprocessor propagates unchanged.
|
||||
|
||||
Retrieval is a single-line delegation to :class:`FaissBridge.retrieve`;
|
||||
see AZ-341 AC-10.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING, Final, Literal
|
||||
|
||||
import numpy as np
|
||||
|
||||
from gps_denied_onboard._types.inference import (
|
||||
BuildConfig,
|
||||
EngineHandle,
|
||||
PrecisionMode,
|
||||
)
|
||||
from gps_denied_onboard._types.vpr import VprQuery, VprResult
|
||||
from gps_denied_onboard.clock import Clock
|
||||
from gps_denied_onboard.components.c2_vpr._faiss_bridge import FaissBridge
|
||||
from gps_denied_onboard.components.c2_vpr._net_vlad_architecture import (
|
||||
DEFAULT_NUM_CLUSTERS,
|
||||
make_net_vlad_vgg16,
|
||||
)
|
||||
from gps_denied_onboard.components.c2_vpr._preprocessor_net_vlad import (
|
||||
NetVladBackbonePreprocessor,
|
||||
)
|
||||
from gps_denied_onboard.components.c2_vpr.descriptor_index_cut import (
|
||||
DescriptorIndexCut,
|
||||
)
|
||||
from gps_denied_onboard.components.c2_vpr.errors import (
|
||||
VprBackboneError,
|
||||
VprPreprocessError,
|
||||
)
|
||||
from gps_denied_onboard.components.c2_vpr.inference_runtime_cut import (
|
||||
InferenceRuntimeCut,
|
||||
)
|
||||
from gps_denied_onboard.config.schema import ConfigError
|
||||
from gps_denied_onboard.fdr_client import EnqueueResult, FdrClient
|
||||
from gps_denied_onboard.fdr_client.records import (
|
||||
CURRENT_SCHEMA_VERSION,
|
||||
FdrRecord,
|
||||
)
|
||||
from gps_denied_onboard.helpers.descriptor_normaliser import DescriptorNormaliser
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from gps_denied_onboard._types.calibration import CameraCalibration
|
||||
from gps_denied_onboard._types.nav import NavCameraFrame
|
||||
from gps_denied_onboard.config.schema import Config
|
||||
|
||||
__all__ = ["MODEL_NAME", "NetVladStrategy", "architecture_factory", "create"]
|
||||
|
||||
|
||||
MODEL_NAME: Final[str] = "net_vlad"
|
||||
|
||||
|
||||
def architecture_factory(
|
||||
descriptor_dim: int,
|
||||
*,
|
||||
num_clusters: int = DEFAULT_NUM_CLUSTERS,
|
||||
):
|
||||
"""Zero-arg architecture factory closure for C7 registry binding.
|
||||
|
||||
The composition root calls this with the configured ``descriptor_dim``
|
||||
and registers the returned closure under :data:`MODEL_NAME` in C7's
|
||||
architecture registry. Keeping the registration step in the
|
||||
composition root preserves the AZ-507 layering: ``c2_vpr`` MUST NOT
|
||||
import ``c7_inference``.
|
||||
"""
|
||||
if descriptor_dim < 1:
|
||||
raise ValueError(
|
||||
f"architecture_factory: descriptor_dim must be >= 1; "
|
||||
f"got {descriptor_dim}"
|
||||
)
|
||||
|
||||
def _factory():
|
||||
return make_net_vlad_vgg16(
|
||||
num_clusters=num_clusters, descriptor_dim=descriptor_dim
|
||||
)
|
||||
|
||||
return _factory
|
||||
|
||||
|
||||
_BACKBONE_LABEL: Final[Literal["net_vlad"]] = "net_vlad"
|
||||
_COMPONENT: Final[str] = "c2_vpr"
|
||||
|
||||
_LOG_KIND_READY: Final[str] = "c2.vpr.ready"
|
||||
_LOG_KIND_BACKBONE_ERROR: Final[str] = "c2.vpr.backbone_error"
|
||||
_LOG_KIND_PREPROCESS_ERROR: Final[str] = "c2.vpr.preprocess_error"
|
||||
_LOG_KIND_FDR_OVERRUN: Final[str] = "c2.vpr.fdr_overrun"
|
||||
|
||||
_FDR_KIND_EMBED: Final[str] = "vpr.embed_query"
|
||||
_FDR_KIND_BACKBONE_ERROR: Final[str] = "vpr.backbone_error"
|
||||
_FDR_KIND_PREPROCESS_ERROR: Final[str] = "vpr.preprocess_error"
|
||||
|
||||
|
||||
class NetVladStrategy:
|
||||
"""C2 mandatory simple-baseline VprStrategy.
|
||||
|
||||
See module docstring for the architecture-loading + per-frame
|
||||
pipeline. Stateless across frames (INV-2); single-threaded per
|
||||
instance (INV-1).
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
inference_runtime: InferenceRuntimeCut,
|
||||
engine_handle: EngineHandle,
|
||||
descriptor_index: DescriptorIndexCut,
|
||||
preprocessor: NetVladBackbonePreprocessor,
|
||||
normaliser: DescriptorNormaliser,
|
||||
faiss_bridge: FaissBridge,
|
||||
fdr_client: FdrClient,
|
||||
clock: Clock,
|
||||
logger: logging.Logger,
|
||||
descriptor_dim: int,
|
||||
num_clusters: int = DEFAULT_NUM_CLUSTERS,
|
||||
) -> None:
|
||||
if descriptor_dim < 1:
|
||||
raise ValueError(
|
||||
f"NetVladStrategy.descriptor_dim must be >= 1; "
|
||||
f"got {descriptor_dim}"
|
||||
)
|
||||
if num_clusters < 1:
|
||||
raise ValueError(
|
||||
f"NetVladStrategy.num_clusters must be >= 1; "
|
||||
f"got {num_clusters}"
|
||||
)
|
||||
if descriptor_dim % num_clusters != 0:
|
||||
raise ValueError(
|
||||
f"NetVladStrategy: descriptor_dim={descriptor_dim} must be "
|
||||
f"divisible by num_clusters={num_clusters} for intra-cluster "
|
||||
f"normalisation"
|
||||
)
|
||||
self._inference_runtime = inference_runtime
|
||||
self._engine_handle = engine_handle
|
||||
self._descriptor_index = descriptor_index
|
||||
self._preprocessor = preprocessor
|
||||
self._normaliser = normaliser
|
||||
self._faiss_bridge = faiss_bridge
|
||||
self._fdr_client = fdr_client
|
||||
self._clock = clock
|
||||
self._logger = logger
|
||||
self._descriptor_dim = descriptor_dim
|
||||
self._num_clusters = num_clusters
|
||||
|
||||
def embed_query(
|
||||
self,
|
||||
frame: NavCameraFrame,
|
||||
calibration: CameraCalibration,
|
||||
) -> VprQuery:
|
||||
try:
|
||||
tensor = self._preprocessor.preprocess(frame, calibration)
|
||||
except VprPreprocessError as exc:
|
||||
self._emit_preprocess_error(frame, exc)
|
||||
raise
|
||||
|
||||
ns_start = self._clock.monotonic_ns()
|
||||
try:
|
||||
outputs = self._inference_runtime.infer(
|
||||
self._engine_handle, {"input": tensor}
|
||||
)
|
||||
except Exception as exc:
|
||||
wrapped = self._wrap_backbone_error(frame, exc)
|
||||
raise wrapped from exc
|
||||
ns_end = self._clock.monotonic_ns()
|
||||
latency_us = max(1, (ns_end - ns_start) // 1_000)
|
||||
|
||||
if "vlad_descriptor" not in outputs:
|
||||
err = VprBackboneError(
|
||||
f"NetVLAD forward returned no 'vlad_descriptor' key; "
|
||||
f"got {sorted(outputs.keys())!r}"
|
||||
)
|
||||
self._emit_backbone_error(frame, err)
|
||||
raise err
|
||||
|
||||
raw = np.asarray(outputs["vlad_descriptor"])
|
||||
if raw.ndim != 2 or raw.shape[0] != 1 or raw.shape[1] != self._descriptor_dim:
|
||||
err = VprBackboneError(
|
||||
f"NetVLAD forward returned shape {raw.shape}; "
|
||||
f"expected (1, {self._descriptor_dim})"
|
||||
)
|
||||
self._emit_backbone_error(frame, err)
|
||||
raise err
|
||||
|
||||
flat = np.ascontiguousarray(raw[0], dtype=np.float16)
|
||||
intra = self._normaliser.intra_cluster_normalise(
|
||||
flat, num_clusters=self._num_clusters
|
||||
)
|
||||
normalised = self._normaliser.l2_normalise(intra)
|
||||
|
||||
self._emit_embed_record(
|
||||
frame_id=int(frame.frame_id), latency_us=int(latency_us)
|
||||
)
|
||||
|
||||
return VprQuery(
|
||||
frame_id=int(frame.frame_id),
|
||||
embedding=normalised,
|
||||
produced_at=ns_end,
|
||||
)
|
||||
|
||||
def retrieve_topk(self, query: VprQuery, k: int) -> VprResult:
|
||||
return self._faiss_bridge.retrieve(
|
||||
query, k, backbone_label=_BACKBONE_LABEL
|
||||
)
|
||||
|
||||
def descriptor_dim(self) -> int:
|
||||
return self._descriptor_dim
|
||||
|
||||
def _wrap_backbone_error(
|
||||
self, frame: NavCameraFrame, exc: BaseException
|
||||
) -> VprBackboneError:
|
||||
wrapped = VprBackboneError(
|
||||
f"NetVLAD forward raised {type(exc).__name__}: {exc}"
|
||||
)
|
||||
self._emit_backbone_error(frame, wrapped)
|
||||
return wrapped
|
||||
|
||||
def _emit_embed_record(self, *, frame_id: int, latency_us: int) -> None:
|
||||
record = FdrRecord(
|
||||
schema_version=CURRENT_SCHEMA_VERSION,
|
||||
ts=_iso_ts_from_clock(self._clock),
|
||||
producer_id=self._fdr_client.producer_id,
|
||||
kind=_FDR_KIND_EMBED,
|
||||
payload={
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
"descriptor_dim": self._descriptor_dim,
|
||||
"latency_us": latency_us,
|
||||
},
|
||||
)
|
||||
result = self._fdr_client.enqueue(record)
|
||||
if result == EnqueueResult.OVERRUN:
|
||||
self._logger.warning(
|
||||
"FDR enqueue dropped vpr.embed_query record (buffer overrun)",
|
||||
extra={
|
||||
"component": _COMPONENT,
|
||||
"kind": _LOG_KIND_FDR_OVERRUN,
|
||||
"kv": {
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
def _emit_backbone_error(
|
||||
self, frame: NavCameraFrame, error: BaseException
|
||||
) -> None:
|
||||
frame_id = int(frame.frame_id)
|
||||
msg = f"NetVLAD backbone error: {error}"
|
||||
self._logger.error(
|
||||
msg,
|
||||
extra={
|
||||
"component": _COMPONENT,
|
||||
"kind": _LOG_KIND_BACKBONE_ERROR,
|
||||
"kv": {
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
"error_type": type(error).__name__,
|
||||
},
|
||||
},
|
||||
)
|
||||
self._fdr_client.enqueue(
|
||||
FdrRecord(
|
||||
schema_version=CURRENT_SCHEMA_VERSION,
|
||||
ts=_iso_ts_from_clock(self._clock),
|
||||
producer_id=self._fdr_client.producer_id,
|
||||
kind=_FDR_KIND_BACKBONE_ERROR,
|
||||
payload={
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
"error_type": type(error).__name__,
|
||||
"error_message": str(error)[:512],
|
||||
},
|
||||
)
|
||||
)
|
||||
|
||||
def _emit_preprocess_error(
|
||||
self, frame: NavCameraFrame, error: BaseException
|
||||
) -> None:
|
||||
frame_id = int(frame.frame_id)
|
||||
msg = f"NetVLAD preprocess error: {error}"
|
||||
self._logger.error(
|
||||
msg,
|
||||
extra={
|
||||
"component": _COMPONENT,
|
||||
"kind": _LOG_KIND_PREPROCESS_ERROR,
|
||||
"kv": {
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
"error_type": type(error).__name__,
|
||||
},
|
||||
},
|
||||
)
|
||||
self._fdr_client.enqueue(
|
||||
FdrRecord(
|
||||
schema_version=CURRENT_SCHEMA_VERSION,
|
||||
ts=_iso_ts_from_clock(self._clock),
|
||||
producer_id=self._fdr_client.producer_id,
|
||||
kind=_FDR_KIND_PREPROCESS_ERROR,
|
||||
payload={
|
||||
"frame_id": frame_id,
|
||||
"backbone_label": _BACKBONE_LABEL,
|
||||
"error_type": type(error).__name__,
|
||||
"error_message": str(error)[:512],
|
||||
},
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
def _iso_ts_from_clock(clock: Clock) -> str:
|
||||
# Same shape every component uses for FDR timestamps; AZ-508 will
|
||||
# consolidate the duplicate helpers across c2/c11/c12/c6.
|
||||
from datetime import datetime, timezone
|
||||
|
||||
ns = int(clock.time_ns())
|
||||
seconds, fraction_ns = divmod(ns, 1_000_000_000)
|
||||
dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
|
||||
return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction_ns:09d}+00:00"
|
||||
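The timestamp helper above can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical `FixedClock` stand-in for the project's `Clock` (only `time_ns()` is used); it is not the project's test code.

```python
from datetime import datetime, timezone


class FixedClock:
    """Hypothetical stand-in for the project's Clock (illustration only)."""

    def __init__(self, ns: int) -> None:
        self._ns = ns

    def time_ns(self) -> int:
        return self._ns


def iso_ts_from_clock(clock: FixedClock) -> str:
    # Split integer nanoseconds so the fraction never passes through a float.
    seconds, fraction_ns = divmod(int(clock.time_ns()), 1_000_000_000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Nine zero-padded fractional digits followed by an explicit UTC offset.
    return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction_ns:09d}+00:00"


stamp = iso_ts_from_clock(FixedClock(1_700_000_000_000_000_042))
print(stamp)
```

Keeping the nanosecond remainder as an integer preserves all nine fractional digits, which `datetime.isoformat()` would truncate to microseconds.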
|
||||
|
||||
def _build_pytorch_build_config(weights_path: Path) -> BuildConfig:
|
||||
del weights_path
|
||||
return BuildConfig(
|
||||
precision=PrecisionMode.FP16,
|
||||
workspace_mb=0,
|
||||
calibration_dataset=None,
|
||||
optimization_profiles=(),
|
||||
)
|
||||
|
||||
|
||||
def create(
|
||||
config: Config,
|
||||
*,
|
||||
descriptor_index: DescriptorIndexCut,
|
||||
inference_runtime: InferenceRuntimeCut,
|
||||
fdr_client: FdrClient | None = None,
|
||||
clock: Clock | None = None,
|
||||
logger: logging.Logger | None = None,
|
||||
) -> NetVladStrategy:
|
||||
"""Module-level factory consumed by :func:`build_vpr_strategy`.
|
||||
|
||||
Prerequisite: the composition root MUST have already registered
|
||||
:func:`architecture_factory` under :data:`MODEL_NAME` in C7's
|
||||
architecture registry before calling this factory. The registration
|
||||
step lives in the composition root to preserve AZ-507 — ``c2_vpr``
|
||||
does not import ``c7_inference``.
|
||||
|
||||
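Keyword-only injection points (``fdr_client`` / ``clock`` / ``logger``)
keep tests deterministic; production wiring fills them from the
composition root. ``clock`` and ``logger`` fall back to defaults when
omitted; ``fdr_client`` must be supplied, and ``None`` raises
:class:`ValueError`.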
|
||||
"""
|
||||
runtime_label = inference_runtime.current_runtime_label()
|
||||
if runtime_label != "pytorch_fp16":
|
||||
raise ConfigError(
|
||||
f"NetVLAD requires BUILD_PYTORCH_RUNTIME=ON; this binary "
|
||||
f"has BUILD_PYTORCH_RUNTIME=OFF (current_runtime_label="
|
||||
f"{runtime_label!r}). Per AZ-338 AC-11, NetVLAD is "
|
||||
f"unselectable when the C7 PyTorch FP16 runtime is "
|
||||
f"excluded."
|
||||
)
|
||||
|
||||
block = config.components["c2_vpr"]
|
||||
descriptor_dim = block.netvlad_descriptor_dim
|
||||
weights_path = block.backbone_weights_path
|
||||
|
||||
if fdr_client is None:
|
||||
raise ValueError(
|
||||
"NetVladStrategy.create: fdr_client is required; the "
|
||||
"composition root must inject the running FDR client."
|
||||
)
|
||||
if clock is None:
|
||||
from gps_denied_onboard.clock.wall_clock import WallClock
|
||||
|
||||
clock = WallClock()
|
||||
if logger is None:
|
||||
logger = logging.getLogger("gps_denied_onboard.c2_vpr.net_vlad")
|
||||
|
||||
entry = inference_runtime.compile_engine(
|
||||
weights_path, _build_pytorch_build_config(weights_path)
|
||||
)
|
||||
# Force the registry lookup to "net_vlad" regardless of the on-disk
|
||||
# filename stem; the registered factory holds the descriptor_dim
|
||||
# closure.
|
||||
entry_for_deserialize = type(entry)(
|
||||
engine_path=entry.engine_path,
|
||||
sha256_hex=entry.sha256_hex,
|
||||
sm=entry.sm,
|
||||
jp=entry.jp,
|
||||
trt=entry.trt,
|
||||
precision=entry.precision,
|
||||
extras={**entry.extras, "model_name": MODEL_NAME},
|
||||
)
|
||||
handle = inference_runtime.deserialize_engine(entry_for_deserialize)
|
||||
|
||||
preprocessor = NetVladBackbonePreprocessor()
|
||||
normaliser = DescriptorNormaliser()
|
||||
faiss_bridge = FaissBridge(
|
||||
descriptor_index=descriptor_index,
|
||||
descriptor_dim=descriptor_dim,
|
||||
warn_top1_threshold=block.warn_top1_threshold,
|
||||
debug_log_per_frame_distances=block.debug_per_frame_distances,
|
||||
fdr_client=fdr_client,
|
||||
logger=logger,
|
||||
clock=clock,
|
||||
)
|
||||
|
||||
_assert_engine_output_dim(
|
||||
inference_runtime, handle, descriptor_dim, preprocessor
|
||||
)
|
||||
|
||||
logger.info(
|
||||
"C2 VPR strategy ready",
|
||||
extra={
|
||||
"component": _COMPONENT,
|
||||
"kind": _LOG_KIND_READY,
|
||||
"kv": {
|
||||
"strategy": _BACKBONE_LABEL,
|
||||
"descriptor_dim": descriptor_dim,
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
return NetVladStrategy(
|
||||
inference_runtime=inference_runtime,
|
||||
engine_handle=handle,
|
||||
descriptor_index=descriptor_index,
|
||||
preprocessor=preprocessor,
|
||||
normaliser=normaliser,
|
||||
faiss_bridge=faiss_bridge,
|
||||
fdr_client=fdr_client,
|
||||
clock=clock,
|
||||
logger=logger,
|
||||
descriptor_dim=descriptor_dim,
|
||||
)
|
||||
|
||||
|
||||
def _assert_engine_output_dim(
|
||||
inference_runtime: InferenceRuntimeCut,
|
||||
handle: EngineHandle,
|
||||
expected_dim: int,
|
||||
preprocessor: NetVladBackbonePreprocessor,
|
||||
) -> None:
|
||||
h, w = preprocessor.input_shape()
|
||||
probe = np.zeros((1, 3, h, w), dtype=np.float16)
|
||||
outputs = inference_runtime.infer(handle, {"input": probe})
|
||||
if "vlad_descriptor" not in outputs:
|
||||
raise ConfigError(
|
||||
f"engine output shape mismatch: 'vlad_descriptor' key absent; "
|
||||
f"got keys {sorted(outputs.keys())!r}"
|
||||
)
|
||||
actual = np.asarray(outputs["vlad_descriptor"])
|
||||
if actual.ndim != 2 or actual.shape[0] != 1 or actual.shape[1] != expected_dim:
|
||||
raise ConfigError(
|
||||
f"engine output shape mismatch: expected (1, {expected_dim}), "
|
||||
f"got {tuple(actual.shape)}"
|
||||
)
|
||||
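The start-up probe in `_assert_engine_output_dim` can be reproduced without a real engine. The sketch below assumes a hypothetical `FakeRuntime` in place of `InferenceRuntimeCut`, raises `ValueError` where the source raises `ConfigError`, and takes the 480x480 input size from the preprocessor description in the commit message.

```python
import numpy as np


class FakeRuntime:
    """Hypothetical stand-in for InferenceRuntimeCut (illustration only)."""

    def __init__(self, out_dim: int) -> None:
        self._out_dim = out_dim

    def infer(self, handle, inputs):
        batch = inputs["input"].shape[0]
        return {"vlad_descriptor": np.zeros((batch, self._out_dim), dtype=np.float16)}


def check_output_dim(runtime, handle, expected_dim, input_hw=(480, 480)) -> None:
    h, w = input_hw
    # A single all-zero FP16 NCHW frame is enough to observe the output shape.
    probe = np.zeros((1, 3, h, w), dtype=np.float16)
    outputs = runtime.infer(handle, {"input": probe})
    if "vlad_descriptor" not in outputs:
        raise ValueError(f"'vlad_descriptor' key absent; got {sorted(outputs)!r}")
    actual = np.asarray(outputs["vlad_descriptor"])
    if actual.shape != (1, expected_dim):
        raise ValueError(
            f"expected (1, {expected_dim}), got {tuple(actual.shape)}"
        )


# Passes silently when the engine emits the configured dimension.
check_output_dim(FakeRuntime(4096), handle=None, expected_dim=4096)
```

Running the probe once in the factory means a descriptor-dimension misconfiguration fails at construction time and not on the first in-flight frame.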
@@ -314,6 +314,46 @@ KNOWN_PAYLOAD_KEYS: Final[dict[str, frozenset[str]]] = {
|
||||
"latency_us",
|
||||
}
|
||||
),
|
||||
# AZ-338 / E-C2: emitted by each concrete C2 ``VprStrategy`` on every
|
||||
# successful ``embed_query(...)`` call (post-flight forensic record
|
||||
# of the embedding pipeline; complements ``vpr.retrieve_topk`` for the
|
||||
# subsequent FAISS lookup). ``frame_id`` echoes
|
||||
# ``NavCameraFrame.frame_id``; ``backbone_label`` is the strategy's
|
||||
# lowercase ``BUILD_VPR_<variant>`` token (e.g. ``"net_vlad"``);
|
||||
# ``descriptor_dim`` is the strategy's stable embedding dim;
|
||||
# ``latency_us`` is the strategy-internal ``Clock.monotonic_ns``
|
||||
# delta around the backbone forward, in integer microseconds.
|
||||
"vpr.embed_query": frozenset(
|
||||
{
|
||||
"frame_id",
|
||||
"backbone_label",
|
||||
"descriptor_dim",
|
||||
"latency_us",
|
||||
}
|
||||
),
|
||||
# AZ-338 / E-C2: emitted when ``embed_query`` raises a
|
||||
# :class:`VprBackboneError` (forward-pass failure: CUDA OOM, TRT
|
||||
# engine deserialize mismatch, etc.). One record per occurrence;
|
||||
# logged at ERROR alongside.
|
||||
"vpr.backbone_error": frozenset(
|
||||
{
|
||||
"frame_id",
|
||||
"backbone_label",
|
||||
"error_type",
|
||||
"error_message",
|
||||
}
|
||||
),
|
||||
# AZ-338 / E-C2: emitted when ``embed_query`` raises a
|
||||
# :class:`VprPreprocessError` (image decode / shape / dtype
|
||||
# violation in the per-strategy preprocessor).
|
||||
"vpr.preprocess_error": frozenset(
|
||||
{
|
||||
"frame_id",
|
||||
"backbone_label",
|
||||
"error_type",
|
||||
"error_message",
|
||||
}
|
||||
),
|
||||
}
|
||||
|
||||
KNOWN_KINDS: Final[frozenset[str]] = frozenset(KNOWN_PAYLOAD_KEYS.keys())
|
||||
|
||||
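The three new record kinds share a simple shape: each maps to a frozen set of allowed payload keys. How the FDR client enforces the registry is not shown in this diff; the sketch below uses a hypothetical `payload_key_diff` helper to illustrate one way a payload could be compared against it.

```python
# Key sets copied from the hunk above; the helper itself is hypothetical.
KNOWN_PAYLOAD_KEYS = {
    "vpr.embed_query": frozenset(
        {"frame_id", "backbone_label", "descriptor_dim", "latency_us"}
    ),
    "vpr.backbone_error": frozenset(
        {"frame_id", "backbone_label", "error_type", "error_message"}
    ),
    "vpr.preprocess_error": frozenset(
        {"frame_id", "backbone_label", "error_type", "error_message"}
    ),
}


def payload_key_diff(kind: str, payload: dict) -> tuple:
    """Return (missing_keys, unexpected_keys) for a payload of the given kind."""
    expected = KNOWN_PAYLOAD_KEYS[kind]
    actual = frozenset(payload)
    return expected - actual, actual - expected


missing, unexpected = payload_key_diff(
    "vpr.embed_query",
    {
        "frame_id": 7,
        "backbone_label": "net_vlad",
        "descriptor_dim": 4096,
        "latency_us": 1250,
    },
)
print(missing, unexpected)
```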
@@ -92,6 +92,59 @@ class DescriptorNormaliser:
|
||||
normalised_f32 = np.where(norms == 0.0, 0.0, as_f32 / safe)
|
||||
return normalised_f32.astype(in_dtype, copy=False)
|
||||
|
||||
@staticmethod
|
||||
def intra_cluster_normalise(
|
||||
descriptor: np.ndarray, num_clusters: int
|
||||
) -> np.ndarray:
|
||||
"""Per-cluster L2 normalisation for VLAD-aggregated descriptors (AZ-338).
|
||||
|
||||
NetVLAD's published preprocessing chain L2-normalises each
|
||||
per-cluster sub-vector BEFORE the global L2 step. The input is
|
||||
a flat 1-D VLAD descriptor of shape ``(num_clusters * cluster_dim,)``
|
||||
which is reshaped to ``(num_clusters, cluster_dim)``, normalised
|
||||
row-wise, then flattened back. ``num_clusters`` must divide
|
||||
``descriptor.shape[0]``.
|
||||
|
||||
Zero-norm sub-vectors are returned as zero (consistent with
|
||||
:meth:`l2_normalise`).
|
||||
"""
|
||||
if not isinstance(descriptor, np.ndarray):
|
||||
raise DescriptorNormaliserError(
|
||||
f"intra_cluster_normalise: expected np.ndarray; "
|
||||
f"got {type(descriptor).__name__}"
|
||||
)
|
||||
if descriptor.ndim != 1:
|
||||
raise DescriptorNormaliserError(
|
||||
f"intra_cluster_normalise: expected 1-D shape (K*D,); "
|
||||
f"got shape {descriptor.shape}"
|
||||
)
|
||||
if not isinstance(num_clusters, int) or isinstance(num_clusters, bool):
|
||||
raise DescriptorNormaliserError(
|
||||
f"intra_cluster_normalise: num_clusters must be a non-bool "
|
||||
f"int; got {num_clusters!r}"
|
||||
)
|
||||
if num_clusters < 1:
|
||||
raise DescriptorNormaliserError(
|
||||
f"intra_cluster_normalise: num_clusters must be >= 1; "
|
||||
f"got {num_clusters}"
|
||||
)
|
||||
total_dim = descriptor.shape[0]
|
||||
if total_dim % num_clusters != 0:
|
||||
raise DescriptorNormaliserError(
|
||||
f"intra_cluster_normalise: descriptor length {total_dim} "
|
||||
f"not divisible by num_clusters={num_clusters}"
|
||||
)
|
||||
_validate_dtype(descriptor, "intra_cluster_normalise")
|
||||
in_dtype = descriptor.dtype
|
||||
cluster_dim = total_dim // num_clusters
|
||||
reshaped = descriptor.reshape(num_clusters, cluster_dim).astype(
|
||||
np.float32, copy=False
|
||||
)
|
||||
norms = np.linalg.norm(reshaped, axis=1, keepdims=True)
|
||||
safe = np.where(norms == 0.0, 1.0, norms)
|
||||
normalised = np.where(norms == 0.0, 0.0, reshaped / safe)
|
||||
return normalised.reshape(total_dim).astype(in_dtype, copy=False)
|
||||
|
||||
@staticmethod
|
||||
def descriptor_metric() -> str:
|
||||
return _METRIC_VALUE
|
||||
|
||||
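The dual-stage normalisation that `intra_cluster_normalise` enables (per-cluster L2, then global L2) can be shown on a toy descriptor. This is a simplified sketch with K=3 clusters of dimension D=2; it omits the input validation and dtype preservation of the method above, and `l2_normalise` here is a local helper, not the class method.

```python
import numpy as np


def intra_cluster_normalise(descriptor: np.ndarray, num_clusters: int) -> np.ndarray:
    # Reshape the flat (K*D,) vector to (K, D) and normalise each row.
    rows = descriptor.reshape(num_clusters, -1).astype(np.float32)
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    # Zero-norm rows divide by 1 and therefore stay zero.
    safe = np.where(norms == 0.0, 1.0, norms)
    return (rows / safe).reshape(-1)


def l2_normalise(descriptor: np.ndarray) -> np.ndarray:
    as_f32 = descriptor.astype(np.float32)
    norm = np.linalg.norm(as_f32)
    return as_f32 if norm == 0.0 else as_f32 / norm


raw = np.array([3.0, 4.0, 0.0, 0.0, 0.0, 5.0], dtype=np.float32)
stage1 = intra_cluster_normalise(raw, num_clusters=3)  # per-cluster unit rows
stage2 = l2_normalise(stage1)                          # unit-length descriptor
print(stage1, stage2)
```

After the first stage every non-zero cluster contributes equally, so a single dominant cluster cannot swamp the global norm in the second stage.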
@@ -103,7 +103,7 @@ def _is_build_flag_on(flag_name: str) -> bool:
|
||||
return raw.strip().lower() in {"on", "1", "true", "yes"}
|
||||
|
||||
|
||||
-def _c2_config(config: "Config") -> "C2VprConfig":
+def _c2_config(config: Config) -> C2VprConfig:
|
||||
"""Pull the registered C2 config block.
|
||||
|
||||
``c2_vpr.__init__`` registers it on import; a missing
|
||||
@@ -113,34 +113,72 @@ def _c2_config(config: "Config") -> "C2VprConfig":
|
||||
return config.components["c2_vpr"]
|
||||
|
||||
|
||||
def _register_strategy_architecture(
|
||||
strategy: str, config: Config
|
||||
) -> None:
|
||||
"""Register the strategy's C7 architecture factory (AZ-338 + future).
|
||||
|
||||
Each concrete strategy module owns its NN architecture but cannot
|
||||
import C7 directly (AZ-507). The strategy module exposes
|
||||
``MODEL_NAME`` + ``architecture_factory(descriptor_dim)`` and the
|
||||
composition root performs the registration with C7. Strategies that
|
||||
do not need a registered architecture (e.g. TRT-engine strategies
|
||||
that bring their own ``.engine``) MAY omit these attributes; the
|
||||
helper no-ops in that case.
|
||||
"""
|
||||
module_info = _STRATEGY_TO_MODULE.get(strategy)
|
||||
if module_info is None:
|
||||
return
|
||||
module_name, _ = module_info
|
||||
try:
|
||||
module = __import__(module_name, fromlist=["architecture_factory"])
|
||||
except ModuleNotFoundError:
|
||||
return
|
||||
factory_fn = getattr(module, "architecture_factory", None)
|
||||
model_name = getattr(module, "MODEL_NAME", None)
|
||||
if factory_fn is None or model_name is None:
|
||||
return
|
||||
if strategy == "net_vlad":
|
||||
descriptor_dim = config.components["c2_vpr"].netvlad_descriptor_dim
|
||||
else:
|
||||
# Future strategies: each may have its own descriptor_dim source;
|
||||
# extend as they land.
|
||||
return
|
||||
from gps_denied_onboard.components.c7_inference import register_architecture
|
||||
|
||||
register_architecture(model_name, factory_fn(descriptor_dim))
|
||||
|
||||
|
||||
def build_vpr_strategy(
|
||||
-config: "Config",
+config: Config,
 *,
-descriptor_index: "DescriptorIndex",
-inference_runtime: "InferenceRuntime",
-) -> "VprStrategy":
+descriptor_index: DescriptorIndex,
+inference_runtime: InferenceRuntime,
+) -> VprStrategy:
|
||||
"""Construct the :class:`VprStrategy` impl selected by config.
|
||||
|
||||
1. Reads ``config.components['c2_vpr'].strategy``.
|
||||
2. Checks the matching ``BUILD_VPR_<variant>`` flag — if OFF,
|
||||
raises :class:`StrategyNotAvailableError` BEFORE any import.
|
||||
3. Lazily imports the concrete strategy module.
|
||||
-4. Constructs the strategy via its module-level
+4. Registers the strategy's NN architecture with C7 (when the
+strategy module exposes ``MODEL_NAME`` + ``architecture_factory``).
+5. Constructs the strategy via its module-level
 ``create(config, descriptor_index, inference_runtime)``
 factory function (each concrete strategy module exports
 ``create`` as its public entry-point; concrete constructors
 stay private).
-5. Pre-flight ``descriptor_dim`` match: ``strategy.descriptor_dim()``
+6. Pre-flight ``descriptor_dim`` match: ``strategy.descriptor_dim()``
 vs ``descriptor_index.descriptor_dim()``. Mismatch raises
 :class:`ConfigError`; ONE ERROR log
 ``kind="c2.vpr.dim_mismatch"`` is emitted; the strategy is
 NOT bound.
-6. On success, ONE INFO log ``kind="c2.vpr.strategy_loaded"``
+7. On success, ONE INFO log ``kind="c2.vpr.strategy_loaded"``
|
||||
with ``strategy`` + ``descriptor_dim``.
|
||||
|
||||
Raises:
|
||||
StrategyNotAvailableError: compile-time flag OFF or
|
||||
-concrete module not yet built (AZ-337..AZ-340 pending).
+concrete module not yet built (AZ-337 / AZ-339 / AZ-340 pending).
|
||||
ConfigError: ``descriptor_dim`` mismatch between strategy
|
||||
and corpus index.
|
||||
"""
|
||||
@@ -176,8 +214,9 @@ def build_vpr_strategy(
|
||||
raise StrategyNotAvailableError(
|
||||
f"VprStrategy {strategy!r} is configured but its concrete impl "
|
||||
f"module {module_name!r} has not been built into this binary "
|
||||
-"yet (AZ-337 / AZ-338 / AZ-339 / AZ-340 pending)."
+"yet (AZ-337 / AZ-339 / AZ-340 pending)."
|
||||
) from exc
|
||||
_register_strategy_architecture(strategy, config)
|
||||
create_fn = getattr(module, "create", None)
|
||||
if create_fn is None:
|
||||
strategy_cls = getattr(module, class_name)
|
||||
|
||||
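The register-then-create ordering described for `_register_strategy_architecture` can be sketched with an in-memory registry. Everything here is hypothetical: `ARCHITECTURE_REGISTRY` and `register_architecture` stand in for C7's registry, and the strategy module is faked with `SimpleNamespace`; the real signatures may differ.

```python
from types import SimpleNamespace

# Hypothetical stand-in for C7's architecture registry.
ARCHITECTURE_REGISTRY = {}


def register_architecture(model_name, builder) -> None:
    ARCHITECTURE_REGISTRY[model_name] = builder


def register_strategy_architecture(module, descriptor_dim: int) -> bool:
    """Bind the module's architecture factory; no-op if it exposes none."""
    factory_fn = getattr(module, "architecture_factory", None)
    model_name = getattr(module, "MODEL_NAME", None)
    if factory_fn is None or model_name is None:
        return False
    # The factory closes over descriptor_dim and returns a zero-arg builder.
    register_architecture(model_name, factory_fn(descriptor_dim))
    return True


# Fake strategy module exposing the two attributes the helper looks for.
net_vlad_module = SimpleNamespace(
    MODEL_NAME="net_vlad",
    architecture_factory=lambda dim: (lambda: {"descriptor_dim": dim}),
)
# Fake module for a strategy that ships its own engine and registers nothing.
engine_only_module = SimpleNamespace()

registered = register_strategy_architecture(net_vlad_module, 4096)
skipped = register_strategy_architecture(engine_only_module, 4096)
print(registered, skipped, sorted(ARCHITECTURE_REGISTRY))
```

Because the registry import lives in the composition root, the strategy module only has to expose two plain attributes and never imports the inference component itself.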