NetVLAD is the C2 comparative baseline per the engine rule (every production-default backbone ships with a simple-baseline alongside). It runs on the C7 PyTorch FP16 runtime (NOT TRT) so a TRT engine compile bug cannot simultaneously break NetVLAD AND UltraVPR.

Production changes:
- c2_vpr/net_vlad.py: NetVladStrategy + module-level create() factory. Constructor wires InferenceRuntimeCut + DescriptorIndexCut + NetVladBackbonePreprocessor + DescriptorNormaliser + FaissBridge. embed_query pipeline: preprocess -> runtime.infer -> dual-stage normalisation (intra-cluster THEN global L2) -> VprQuery. retrieve_topk delegates one-line to FaissBridge.
- c2_vpr/_net_vlad_architecture.py: Arandjelovic et al. 2016 NetVLAD layer over torchvision VGG16 features + optional Linear PCA projection to descriptor_dim (default 4096; published Pittsburgh reference uses K*D=64*512=32768 raw + Linear(32768, 4096) PCA).
- c2_vpr/_preprocessor_net_vlad.py: OpenCV-based image preprocessor: decode -> centre-crop square -> resize (480, 480) -> ImageNet normalisation -> FP16 NCHW. Calibration is not consumed (NetVLAD is calibration-agnostic per published preprocessing chain).
- c2_vpr/inference_runtime_cut.py: NEW AZ-507 consumer-side cut mirroring C7 InferenceRuntime; lets c2_vpr stay AZ-507-clean.
- c2_vpr/config.py: added netvlad_descriptor_dim: int = 4096 knob.
- helpers/descriptor_normaliser.py: added intra_cluster_normalise (DescriptorNormaliser v1.0.0 -> v1.1.0; backward-compatible add).
- runtime_root/vpr_factory.py: added _register_strategy_architecture helper that binds (MODEL_NAME, architecture_factory(descriptor_dim)) to C7's architecture registry before delegating to the strategy's create() factory. Keeps the c7 import at L4, preserves AZ-507.
- fdr_client/records.py: registered vpr.embed_query, vpr.backbone_error, vpr.preprocess_error record kinds.
Tests:
- tests/unit/c2_vpr/test_net_vlad.py: 31 tests covering all 11 ACs + preprocessor contract + architecture factory + constructor validation + FDR record emission.
- tests/unit/test_az283_descriptor_normaliser.py: +8 tests for the new intra_cluster_normalise.
- tests/unit/test_az272_fdr_record_schema.py: +3 fixture payloads.

Full unit suite: 1608 passed / 80 env-skipped (+43 new tests).

Per-batch code review (batch_46_review.md): PASS_WITH_WARNINGS (4 Low-severity hygiene findings; no Critical/High/Medium).

Architectural notes:
- The spec implied c2_vpr.net_vlad.create() registers the architecture with C7. That violates AZ-507 (no cross-component imports). Resolved by exposing MODEL_NAME + architecture_factory(descriptor_dim) on the strategy module and having the composition root perform the C7 bind.
- C7 PyTorch runtime API names in the spec (forward, load_engine) were outdated; aligned implementation with the live v1.0.0 Protocol (infer, compile_engine + deserialize_engine). Spec hygiene flagged in review F2.

Co-authored-by: Cursor <cursoragent@cursor.com>
Code Review — Batch 46 / AZ-338 (C2 NetVLAD Mandatory Simple-Baseline)
Date: 2026-05-13
Mode: Per-batch (all 7 phases)
Task: AZ-338 (C2 NetVLAD Mandatory Simple-Baseline, 3pt)
Verdict: PASS_WITH_WARNINGS
Scope
| Domain | Files |
|---|---|
| c2_vpr (production) | net_vlad.py (NEW), _net_vlad_architecture.py (NEW), _preprocessor_net_vlad.py (NEW), inference_runtime_cut.py (NEW — AZ-507 cut of C7 InferenceRuntime), config.py (added netvlad_descriptor_dim: int = 4096), __init__.py (re-exports InferenceRuntimeCut) |
| Shared helpers | helpers/descriptor_normaliser.py (added intra_cluster_normalise(descriptor, num_clusters) — backward-compatible v1.1.0) |
| FDR | fdr_client/records.py (registered vpr.embed_query, vpr.backbone_error, vpr.preprocess_error per the AZ-338 spec § Outcome) |
| Composition root | runtime_root/vpr_factory.py (added _register_strategy_architecture helper; calls C7 register_architecture for the strategy's MODEL_NAME + architecture_factory pair before delegating to create()) |
| Tests | tests/unit/c2_vpr/test_net_vlad.py (NEW, 31 tests), tests/unit/test_az283_descriptor_normaliser.py (+8 tests for the new method), tests/unit/test_az272_fdr_record_schema.py (+3 fixture payloads) |
| Docs | _docs/02_document/contracts/shared_helpers/descriptor_normaliser.md (v1.0.0 → v1.1.0; documented intra_cluster_normalise row + changelog entry) |
Phase 1 — Context Loading
Inputs reviewed:
- AZ-338 spec (_docs/02_tasks/todo/AZ-338_c2_net_vlad.md).
- vpr_strategy_protocol.md v1.0.0: 7 invariants; INV-3 (L2-normalised embedding) is the central correctness contract.
- c2_vpr/_faiss_bridge.py (AZ-341, prior batch): the strategy's one-line retrieve delegation target.
- c7_inference/pytorch_fp16_runtime.py (AZ-300): the runtime that actually deserializes the registered NetVLAD architecture.
- c7_inference/architecture_registry.py: the registration target; rejects re-registration with a different factory under the same key (defensive against accidental collision).
- AZ-507 lint rule (tests/unit/test_az270_compose_root.py::test_ac6_only_compose_root_imports_concrete_strategies): components MAY NOT import other components.
- _types/inference.py: BuildConfig, EngineCacheEntry, EngineHandle, PrecisionMode (L1 shared DTOs the strategy uses).
Phase 2 — Spec Compliance
All 11 ACs satisfied:
| AC | Description | Covering test(s) |
|---|---|---|
| AC-1 | Protocol conformance | test_ac1_protocol_conformance |
| AC-2 | L2-norm == 1.0 ± 1e-3 FP16 (D,) | test_ac2_embed_query_returns_unit_norm_fp16_descriptor + 512-PCA variant |
| AC-3 | intra_cluster_normalise BEFORE l2_normalise | test_ac3_intra_cluster_called_before_global_l2 + once-each |
| AC-4 | Deterministic across 3 calls | test_ac4_embed_query_deterministic_for_same_frame |
| AC-5 | retrieve_topk == k, label="net_vlad", sorted | test_ac5_retrieve_topk_returns_exactly_k_with_net_vlad_label |
| AC-6 | descriptor_dim() stable | 4096 + 512 instance variants |
| AC-7 | Engine output shape mismatch → ConfigError | test_ac7_create_rejects_engine_output_shape_mismatch |
| AC-8 | VprBackboneError on forward failure | RuntimeError + missing-key + wrong-shape variants |
| AC-9 | VprPreprocessError on corrupt image | non-array + wrong-dtype + wrong-shape variants |
| AC-10 | Composition-root wiring + c2.vpr.ready log | INFO log + model_name forcing |
| AC-11 | BUILD_PYTORCH_RUNTIME=OFF → ConfigError fail-fast | tensorrt + onnx_trt_ep runtime label variants |
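AC-2 and AC-3 together pin down the dual-stage normalisation: per-cluster normalisation first, global L2 second, with a unit-norm FP16 result. The following is a minimal NumPy sketch of that ordering, not the project's DescriptorNormaliser. The function bodies, the FP32 intermediate math, the epsilon guard and the K=64, D=64 shapes are assumptions for illustration; only the intra_cluster_normalise(descriptor, num_clusters) signature and the float64 rejection come from this review.

```python
import numpy as np


def intra_cluster_normalise(descriptor: np.ndarray, num_clusters: int) -> np.ndarray:
    """L2-normalise each of the K per-cluster blocks independently (FP32 math)."""
    if descriptor.dtype == np.float64:
        raise TypeError("float64 descriptors are rejected; pass float16 or float32")
    if descriptor.ndim != 1 or descriptor.size % num_clusters != 0:
        raise ValueError("expected a 1-D descriptor whose length is divisible by num_clusters")
    blocks = descriptor.astype(np.float32).reshape(num_clusters, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    # The epsilon guard (an assumption) avoids dividing by zero for empty clusters.
    return (blocks / np.maximum(norms, 1e-12)).reshape(-1)


def l2_normalise(descriptor: np.ndarray) -> np.ndarray:
    """Scale the whole descriptor to unit L2 norm (FP32 math)."""
    vec = descriptor.astype(np.float32)
    return vec / max(float(np.linalg.norm(vec)), 1e-12)


def dual_stage_normalise(raw: np.ndarray, num_clusters: int) -> np.ndarray:
    """Intra-cluster first, global L2 second, then cast to FP16."""
    return l2_normalise(intra_cluster_normalise(raw, num_clusters)).astype(np.float16)


rng = np.random.default_rng(0)
raw = rng.standard_normal(64 * 64).astype(np.float16)  # hypothetical K=64, D=64
out = dual_stage_normalise(raw, num_clusters=64)
# Measure the norm in FP32 so the check itself does not accumulate FP16 error.
norm = float(np.linalg.norm(out.astype(np.float32)))
```

Doing the arithmetic in FP32 and casting once at the end keeps the rounding error of the final FP16 descriptor within the AC-2 tolerance of 1e-3.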
Spec deviations:
- runtime.forward(engine_id, ...) → runtime.infer(handle, ...): the spec used placeholder names; the actual C7 InferenceRuntime Protocol API is infer(handle, inputs) + compile_engine + deserialize_engine. Aligned with the live Protocol shape (AZ-297). Flag: spec wording should be refreshed to match the c7 contract.
- Architecture registration moved from c2_vpr.net_vlad.create() to runtime_root/vpr_factory.py::_register_strategy_architecture: the spec implies the strategy's create(...) registers the architecture with C7. That violates AZ-507 (c2_vpr cannot import c7_inference). Resolved by exposing MODEL_NAME + architecture_factory(descriptor_dim) on the strategy module and having the composition root perform the c7 binding before calling create(...). The C7-side register_architecture call lives at L4 (runtime_root), not L3. This is a design improvement over the spec; the spec should be updated.
- NetVladStrategy.__init__ signature: differs from the spec's positional argument list (the spec lists runtime, tile_store, weights_path, preprocessor, normaliser, fdr_client, descriptor_dim). Implemented as keyword-only, with engine_handle (returned from deserialize_engine) replacing weights_path: the strategy holds the resolved handle, not the source path, which is more consistent with the spec's own "holds the engine ID, NOT the engine itself" constraint. The tile_store field was also renamed to descriptor_index to match DescriptorIndexCut (AZ-507 cut).
Aligning the spec with the implementation is tracked in the Findings below (see F2).
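The registration split described in the second deviation can be sketched as follows. This is a stand-alone illustration under stated assumptions, not the project's code: ArchitectureRegistry, its identity-based collision check, the register_strategy_architecture helper signature, the MODEL_NAME value and the dictionary returned by the factory are all hypothetical. Only the MODEL_NAME + architecture_factory(descriptor_dim) shape and the "reject a different factory under the same key" behaviour come from this review.

```python
from typing import Callable, Dict

ArchitectureFactory = Callable[[], object]


class ArchitectureRegistry:
    """Stand-in for the C7 registry: same key + different factory is an error."""

    def __init__(self) -> None:
        self._factories: Dict[str, ArchitectureFactory] = {}

    def register_architecture(self, name: str, factory: ArchitectureFactory) -> None:
        existing = self._factories.get(name)
        if existing is not None and existing is not factory:
            raise ValueError(f"{name!r} is already registered with a different factory")
        self._factories[name] = factory

    def build(self, name: str) -> object:
        return self._factories[name]()


# Strategy-module side: exposes data only and never imports the registry.
MODEL_NAME = "net_vlad"  # assumed value


def architecture_factory(descriptor_dim: int) -> ArchitectureFactory:
    def build() -> object:
        # A real factory would construct the network; a dict stands in here.
        return {"model": MODEL_NAME, "descriptor_dim": descriptor_dim}

    return build


# Composition-root side: the only place that touches both components.
def register_strategy_architecture(
    registry: ArchitectureRegistry, model_name: str, factory: ArchitectureFactory
) -> None:
    registry.register_architecture(model_name, factory)


registry = ArchitectureRegistry()
factory = architecture_factory(4096)
register_strategy_architecture(registry, MODEL_NAME, factory)
register_strategy_architecture(registry, MODEL_NAME, factory)  # same factory: accepted
built = registry.build(MODEL_NAME)
```

Because each architecture_factory(...) call returns a fresh closure, the composition root in this sketch has to create the factory once and reuse it; registering architecture_factory(512) under the same key would be rejected.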
Phase 3 — Code Quality
- Every function ≤ ~50 LOC except make_net_vlad_vgg16 (~75 LOC, of which 60 are inner nn.Module definitions: natural, indivisible).
- No bare except; every error chain uses raise ... from exc.
- No silently-swallowed errors; the strategy emits ERROR logs + an FDR record for both VprBackboneError and VprPreprocessError paths.
- Constructor validation is consistent: ValueError for range/shape violations, TypeError for type violations (matches the pattern of the prior batch's FaissBridge).
- The _iso_ts_from_clock helper is duplicated yet again: sixth module-local copy (see F1 below; carried over from cumulative review 43-45).
- Class names (NetVladStrategy, NetVladBackbonePreprocessor) match the spec.
- No verbose default-on debug logging; logs are scoped to ERROR-on-error + one INFO c2.vpr.ready at composition time.
- Ruff clean on every new file (UP037 auto-fixes applied; one RUF002 ambiguous-glyph in _net_vlad_architecture.py docstring fixed in Phase F).
Phase 4 — Security Quick-Scan
- No SQL injection / command injection / eval / exec.
- No hardcoded secrets.
- FDR error-message payload is bounded to str(error)[:512]: prevents unbounded sensitive-data exfiltration via long exception messages.
- No PII; vpr.embed_query payload is (frame_id, backbone_label, descriptor_dim, latency_us), all operational metadata.
- The intra_cluster_normalise helper rejects float64 input: denies upcasts that would silently break the FAISS metric.
- The c7_inference.register_architecture call lives in the composition root, which runs at startup; not reachable from user-controlled input.
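The bounded error payload and the raise ... from exc chaining noted in Phases 3 and 4 can be illustrated with a small stand-alone sketch. The VprBackboneError class below is a local stand-in, and the run_backbone and bounded_error_payload helpers, the emit callback and the payload field names are hypothetical; only the str(error)[:512] bound, the vpr.backbone_error record kind and the chaining style come from this review.

```python
from typing import Any, Callable, Dict, List, Tuple

MAX_ERROR_CHARS = 512


class VprBackboneError(RuntimeError):
    """Local stand-in for the project's backbone error type."""


def bounded_error_payload(frame_id: int, error: BaseException) -> Dict[str, Any]:
    # Truncate the message so a huge exception string cannot bloat the record.
    return {
        "frame_id": frame_id,
        "backbone_label": "net_vlad",
        "error": str(error)[:MAX_ERROR_CHARS],
    }


def run_backbone(
    frame_id: int,
    infer: Callable[[], Any],
    emit: Callable[[str, Dict[str, Any]], None],
) -> Any:
    try:
        return infer()
    except RuntimeError as exc:
        emit("vpr.backbone_error", bounded_error_payload(frame_id, exc))
        # Chain the original exception so the root cause stays in the traceback.
        raise VprBackboneError(f"backbone inference failed for frame {frame_id}") from exc


records: List[Tuple[str, Dict[str, Any]]] = []


def failing_infer() -> Any:
    raise RuntimeError("x" * 10_000)


caught = None
try:
    run_backbone(7, failing_infer, lambda kind, payload: records.append((kind, payload)))
except VprBackboneError as err:
    caught = err
```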
Phase 5 — Performance Scan
- embed_query p95 ≤ 80ms NFR: not verified by microbench in this batch (deferred to C2-IT-01 / FT-P-19, Step 9). Justification: microbench requires real PyTorch CUDA + real NetVLAD weights; the current Tier-0 dev host has neither.
- retrieve_topk p95 ≤ 4ms: the FaissBridge (AZ-341) already carries the p95 ≤ 500µs microbench; this strategy is a single-line delegation, no added overhead.
- The architecture's NetVLAD pooling layer uses torch.bmm for the K-cluster reduction instead of a Python loop: a single optimised CUDA kernel call. The published reference impl from Pittsburgh has a Python for k in range(K) loop; this batched form is asymptotically equivalent (K ~ 64) and dramatically faster on GPU.
- The dual-stage normalisation is two FP32-on-FP16-input operations over a ~4096-element working set: sub-µs on any host.
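The equivalence between the per-cluster loop and the batched reduction can be checked numerically. The sketch below uses NumPy's batched np.matmul as a stand-in for torch.bmm so that it runs without PyTorch; it is not the code in _net_vlad_architecture.py, and the function names and tensor shapes are assumptions. It relies on the identity V[k] = Σ_n a[n,k]·(x[n] − c[k]) = (aᵀx)[k] − (Σ_n a[n,k])·c[k].

```python
import numpy as np


def vlad_loop(x: np.ndarray, assign: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Reference form: one pass over the local descriptors per cluster."""
    batch, _, dim = x.shape
    num_clusters = centroids.shape[0]
    out = np.zeros((batch, num_clusters, dim), dtype=x.dtype)
    for k in range(num_clusters):
        residual = x - centroids[k]          # (B, N, D)
        weights = assign[:, :, k:k + 1]      # (B, N, 1)
        out[:, k, :] = (weights * residual).sum(axis=1)
    return out


def vlad_batched(x: np.ndarray, assign: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Batched form: one batched matmul plus a centroid correction term."""
    weighted_sum = np.matmul(assign.transpose(0, 2, 1), x)  # (B, K, D)
    mass = assign.sum(axis=1)[:, :, None]                    # (B, K, 1)
    return weighted_sum - mass * centroids[None, :, :]


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 30, 8))          # B=2 images, N=30 locations, D=8
centroids = rng.standard_normal((4, 8))      # K=4 clusters
logits = rng.standard_normal((2, 30, 4))
exp = np.exp(logits - logits.max(axis=2, keepdims=True))
assign = exp / exp.sum(axis=2, keepdims=True)  # soft-assignment over clusters

same = np.allclose(vlad_loop(x, assign, centroids), vlad_batched(x, assign, centroids))
```

Both forms perform the same arithmetic; the batched one simply hands the K-way reduction to a single matrix-multiply call instead of K separate passes.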
Phase 6 — Cross-Task Consistency
NetVLAD is the first concrete VprStrategy implementation. Cross-task consistency therefore concerns the patterns it establishes for AZ-337 (UltraVPR), AZ-339 (MegaLoc/MixVPR), AZ-340 (SelaVPR/EigenPlaces/SALAD):
- AZ-507 cut pattern: InferenceRuntimeCut joins DescriptorIndexCut (AZ-341), TileUploaderCut (AZ-329), TileDownloaderCut (AZ-328). Five Protocol cuts now exist cross-component; all named *Cut; all runtime_checkable=True; all one Protocol per file; all consumed via the consumer-side cut module path. Pattern is stable.
- Architecture-registration split: the strategy module exposes MODEL_NAME + architecture_factory(descriptor_dim); the composition root performs the c7 registration. Future C2 strategies using the PyTorch runtime (AZ-339 MegaLoc/MixVPR with VGG/ResNet backbones; AZ-340 SelaVPR/EigenPlaces/SALAD with various backbones) follow the same shape; the composition-root helper _register_strategy_architecture already has the dispatch slot for per-strategy descriptor_dim lookup.
- Dual-stage normalisation: NetVLAD's intra_cluster_normalise → l2_normalise chain is unique to NetVLAD (UltraVPR uses single-stage l2_normalise per the AZ-337 spec). The helper addition to DescriptorNormaliser is therefore NetVLAD-specific by invocation but architectural-pattern-neutral by API; future VLAD-aggregating strategies (SALAD has VLAD-like aggregation) can reuse the same helper.
- FDR record kinds: vpr.embed_query / vpr.backbone_error / vpr.preprocess_error are strategy-generic; every concrete C2 strategy emits the same three plus the AZ-341 vpr.retrieve_topk from the bridge.
Phase 7 — Architecture Compliance
- Layer direction (rule 1): no upward imports. The strategy module imports _types, clock, config, fdr_client, helpers, logging, and its sibling c2_vpr modules, all at or below L3.
- Public API respect / AZ-507 (rule 2): verified by the test_ac6_only_compose_root_imports_concrete_strategies lint: PASS. c2_vpr/net_vlad.py consumes InferenceRuntimeCut (defined in c2_vpr) instead of importing c7_inference.InferenceRuntime.
- No new cyclic dependencies (rule 3): no new cycles.
- Duplicate symbols (rule 4): _iso_ts_from_clock now in 6 modules (carry-over F1, AZ-508 covers consolidation). No new duplications introduced.
- Cross-cutting concerns not locally re-implemented (rule 5): the composition root owns the c7 architecture registration; the c2_vpr factory does not.
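The consumer-side cut checked under rule 2 can be sketched with typing.Protocol. This is an illustration under assumptions, not the contents of c2_vpr/inference_runtime_cut.py: the single-method Protocol, the FakeRuntime provider, the SketchStrategy consumer and the "descriptor" output key are hypothetical. The infer(handle, inputs) shape, the {"input": tensor} call form, the keyword-only constructor and the TypeError-for-type-violations convention come from this review.

```python
from typing import Any, Mapping, Protocol, runtime_checkable


@runtime_checkable
class InferenceRuntimeCut(Protocol):
    """Consumer-owned view of the runtime: only what the consumer calls."""

    def infer(self, handle: Any, inputs: Mapping[str, Any]) -> Mapping[str, Any]:
        ...


class FakeRuntime:
    """Provider-side object; it never imports or subclasses the cut."""

    def infer(self, handle: Any, inputs: Mapping[str, Any]) -> Mapping[str, Any]:
        return {"descriptor": [1.0, 0.0]}


class SketchStrategy:
    def __init__(self, *, runtime: InferenceRuntimeCut, engine_handle: Any) -> None:
        # runtime_checkable isinstance only verifies that the method exists,
        # not its signature; it is a cheap structural guard, nothing more.
        if not isinstance(runtime, InferenceRuntimeCut):
            raise TypeError("runtime must satisfy InferenceRuntimeCut")
        self._runtime = runtime
        self._engine_handle = engine_handle

    def embed(self, tensor: Any) -> Any:
        return self._runtime.infer(self._engine_handle, {"input": tensor})["descriptor"]


strategy = SketchStrategy(runtime=FakeRuntime(), engine_handle="handle-0")
descriptor = strategy.embed([0.0])
```

Because the match is structural, the provider component needs no import of the consumer's module, and the consumer needs no import of the provider's, which is the property the lint rule enforces.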
Findings
| # | Severity | Category | Files | Title |
|---|---|---|---|---|
| F1 | Low | Maintainability | c2_vpr/net_vlad.py | _iso_ts_from_clock duplicated (6th module-local copy) |
| F2 | Low | Spec-Hygiene | AZ-338 task spec | Spec § Outcome lists outdated C7 API names (runtime.forward vs infer; runtime.load_engine vs compile_engine + deserialize_engine) + architecture-registration location |
| F3 | Low | Test-Coverage | tests/unit/c2_vpr/test_net_vlad.py | NFR-perf microbench (p95 ≤ 80ms) deferred (no Tier-1 PyTorch CUDA host); flagged in Phase 5 |
| F4 | Low | Architecture | _net_vlad_architecture.py | NetVLAD's PCA-projection layer parameters are part of the loaded .pth state dict; weights validation that the PCA centroids match the recorded sidecar is deferred to AZ-280 (engine sidecar) integration |
Finding Details
F1: _iso_ts_from_clock duplicated (6th copy) (Low / Maintainability)
- Location: src/gps_denied_onboard/components/c2_vpr/net_vlad.py module-level function.
- Description: same 6-line helper as c2_vpr/_faiss_bridge.py, c12_operator_orchestrator/operator_reloc_service.py, c11_tile_manager/idempotent_retry.py, c11_tile_manager/signing_key.py, c6_tile_cache/postgres_filesystem_store.py, c6_tile_cache/freshness_gate.py; six modules now.
- Suggestion: AZ-508 (hygiene PBI for ISO-timestamp consolidation) is already in todo/ and scoped to absorb all six call-sites.
F2: AZ-338 spec uses outdated C7 API names + architecture-registration location (Low / Spec-Hygiene)
- Locations:
  - Spec § Outcome: intermediate = self._runtime.forward(self._engine_id, {"input": tensor}) → live API is self._runtime.infer(self._engine_handle, {"input": tensor}).
  - Spec § Outcome: inference_runtime.load_engine(weights_path) → live API is compile_engine(model_path, build_config) -> entry; deserialize_engine(entry) -> handle.
  - Spec § Outcome implies create(...) performs the C7 architecture registration; AZ-507 forbids this. Resolved by moving the registration to runtime_root/vpr_factory.py::_register_strategy_architecture.
- Description: the spec was written against an earlier C7 Protocol draft; the C7 Protocol stabilised at v1.0.0 in AZ-297. The implementation aligns with the v1.0.0 Protocol; the spec is now stale on this detail.
- Suggestion: surface to user as a small spec-hygiene follow-up. Same class of finding as cumulative review F3 (AZ-341 spec listed an unused normaliser parameter). Recommend a single hygiene PBI scoped to "refresh AZ-337..AZ-340 specs against the stabilised C7 v1.0.0 + AZ-507 patterns".
F3: NFR-perf microbench deferred (no Tier-1 PyTorch CUDA host) (Low / Test-Coverage)
- Location: tests/unit/c2_vpr/test_net_vlad.py (no microbench test class for AZ-338 NFR-perf).
- Description: the AZ-338 spec NFRs cite p95 ≤ 80ms for embed_query on Tier-1 Jetson Orin. Microbench requires real PyTorch CUDA + real NetVLAD weights; not runnable on this Tier-0 dev host (macOS, no CUDA). The fake InferenceRuntime returns a synthetic output and therefore cannot probe real-runtime latency.
- Suggestion: schedule under FT-P-19 / C2-IT-01 (Step 9 / E-BBT) on Tier-1 hardware. No action this batch.
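For when the microbench is scheduled, a p95 measurement can be sketched as below. This is a generic timing harness under assumptions, not the planned FT-P-19 test: p95_latency_ms, its warmup and iteration counts, and the sleeping fake_embed_query stand-in are hypothetical. On a real CUDA host the timed callable would also need to synchronise the device (for example with torch.cuda.synchronize()) before the clock is read, otherwise only the kernel launch is measured.

```python
import time
from typing import Callable

import numpy as np


def p95_latency_ms(
    fn: Callable[[], object], *, warmup: int = 5, iterations: int = 100
) -> float:
    """Return the 95th-percentile wall-clock latency of fn in milliseconds."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation; these runs are not timed
    samples_ms = []
    for _ in range(iterations):
        start_ns = time.perf_counter_ns()
        fn()
        samples_ms.append((time.perf_counter_ns() - start_ns) / 1e6)
    return float(np.percentile(samples_ms, 95))


def fake_embed_query() -> None:
    """Stand-in workload; a real run would call the strategy's embed_query."""
    time.sleep(0.002)


p95 = p95_latency_ms(fake_embed_query, warmup=2, iterations=20)
within_budget = p95 <= 80.0  # the AZ-338 NFR budget for embed_query
```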
F4: PCA-projection sidecar verification deferred (Low / Architecture)
- Location: src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py PCA nn.Linear(K*D, descriptor_dim).
- Description: the architecture loads its PCA-projection layer's weights from the same .pth state dict as the rest of the model via torch.load + load_state_dict(strict=True). There is no separate check that the PCA centroids + whitening matrix match the sha256 sidecar (AZ-280). For now the deserialize-time strict-mode check is the only safeguard.
- Suggestion: schedule under a future "C2 PCA-whitening sidecar validation" PBI if FT-P-19 / C2-IT-01 reveals real-world drift. No action this batch.
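One possible shape for the deferred check is a whole-file sha256 comparison before the weights are loaded. The sketch below is an assumption-laden illustration, not the AZ-280 design: the sidecar format (hex digest as the first whitespace-separated token, as sha256sum writes it), the file names and both helper functions are hypothetical. A whole-file hash covers the PCA layer only because those parameters live in the same .pth file.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_sidecar(weights_path: Path, sidecar_path: Path) -> None:
    """Raise ValueError when the weights do not match the recorded digest."""
    expected = sidecar_path.read_text(encoding="utf-8").split()[0].lower()
    actual = sha256_of_file(weights_path)
    if actual != expected:
        raise ValueError(
            f"sha256 mismatch for {weights_path.name}: expected {expected}, got {actual}"
        )


with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "net_vlad.pth"            # hypothetical file name
    sidecar = Path(tmp) / "net_vlad.pth.sha256"     # hypothetical sidecar name
    weights.write_bytes(b"stand-in weight bytes")
    sidecar.write_text(sha256_of_file(weights) + "  net_vlad.pth\n", encoding="utf-8")
    verify_against_sidecar(weights, sidecar)        # digest matches: no exception
    weights.write_bytes(b"tampered bytes")
    try:
        verify_against_sidecar(weights, sidecar)
        tamper_detected = False
    except ValueError:
        tamper_detected = True
```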
Verdict
PASS_WITH_WARNINGS — 4 Low-severity findings, all hygiene / deferred-validation. No Critical, no High, no Medium. AC coverage is complete; full unit suite is green (1608 passed / 80 env-skipped, +43 tests over batch 45).