Files
Oleksandr Bezdieniezhnykh 3c4fd272f1 [AZ-337] C2 UltraVPR primary backbone VprStrategy
UltraVPR is the Documentary Lead's PRIMARY backbone per
description.md § 1 and is wired by default
(config.c2_vpr.strategy = "ultra_vpr"). Runs on the C7 TensorRT
runtime (AZ-298) or ONNX-Runtime fallback (AZ-299); explicitly NOT
on the PyTorch FP16 runtime so a TRT engine compile bug can fall
back to NetVLAD without simultaneously breaking both strategies.

Production changes:
- c2_vpr/ultra_vpr.py - UltraVprStrategy + module-level create()
  factory. embed_query pipeline: preprocess -> runtime.infer ->
  single-stage L2 -> VprQuery. retrieve_topk delegates one-line to
  FaissBridge. Engine load + output-shape assertion happen at
  create() time (AC-6) so misconfiguration surfaces at startup,
  not 17 minutes into a flight. UltraVPR has D=512 fixed (NOT a
  config knob; AC-5 / AC-6 / AC-7 all assume 512). Single-stage L2
  (no intra-cluster step like NetVLAD; spy-test enforces this so a
  future refactor cannot silently regress recall).
- c2_vpr/_preprocessor_ultra_vpr.py - centre-crop using the camera
  calibration's principal point (cx, cy from intrinsics_3x3),
  falling back to geometric centre + WARN log when calibration is
  absent (AC-9). Resize -> (384, 384) -> ImageNet mean/std ->
  FP16 NCHW.
- No composition-root changes: UltraVPR consumes a pre-compiled
  .trt engine (no PyTorch nn.Module), so the strategy module does
  NOT expose MODEL_NAME / architecture_factory. The composition-
  root _register_strategy_architecture helper no-ops cleanly for
  this case (verified by test_create_does_not_register_pytorch_architecture).

Tests:
- tests/unit/c2_vpr/test_ultra_vpr.py - 29 tests covering all 12
  ACs + preprocessor contract + constructor validation + FDR
  record emission + single-stage L2 enforcement.

Full unit suite: 1637 passed / 80 env-skipped (+29 new tests).
Per-batch code review (batch_47_review.md): PASS_WITH_WARNINGS
(3 Low-severity findings; no Critical / High / Medium):
- F1: _iso_ts_from_clock is now the 7th copy (AZ-508 will close).
- F2: AZ-337 spec uses outdated C7 API names; affects upcoming
  AZ-339 / AZ-340. Spec-hygiene PBI recommended.
- F3: principal-point fallback uses (0, 0) zero-detection for
  missing calibration; safe but tightens when intrinsics become
  Optional.

Architectural notes:
- AZ-507 layering clean. Imports only InferenceRuntimeCut,
  DescriptorIndexCut, c2_vpr internals, _types, helpers,
  clock, fdr_client. Architecture lint test passes.
- Pattern parity with NetVLAD (B46) where semantics permit;
  UltraVPR-specific paths (single-stage L2, 'embedding' output
  key, TRT runtime, no architecture registry, principal-point
  crop) are clearly localised.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 22:43:17 +03:00

8.7 KiB

Batch 47 / Cycle 1 — Per-Batch Code Review

Date: 2026-05-13 Tasks: AZ-337 — C2 UltraVPR Primary Backbone (5pt) Reviewer: autodev orchestrator (inline review) Verdict: PASS_WITH_WARNINGS — three Low-severity findings, none blocking.

Scope

  • src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py (new, 372 lines)
  • src/gps_denied_onboard/components/c2_vpr/_preprocessor_ultra_vpr.py (new, 188 lines)
  • tests/unit/c2_vpr/test_ultra_vpr.py (new, 581 lines, 29 tests)
  • _docs/_autodev_state.md (state bump)

No modifications to existing production files. The composition-root helper _register_strategy_architecture already no-ops cleanly for strategies that do not expose MODEL_NAME / architecture_factory — exactly the design path AZ-337 takes.

Acceptance-Criteria Coverage Verification

12/12 ACs have at least one covering test:

AC Test(s)
AC-1 (Protocol conformance) test_ac1_protocol_conformance
AC-2 (L2-norm FP16 (512,)) test_ac2_embed_query_returns_unit_norm_fp16_512, test_ac2_embedding_is_single_stage_l2_no_intra_cluster_path
AC-3 (deterministic) test_ac3_embed_query_deterministic_for_same_frame
AC-4 (retrieve_topk == k, sorted, label) test_ac4_retrieve_topk_returns_exactly_k_with_ultra_vpr_label
AC-5 (descriptor_dim() stable returns 512) test_ac5_descriptor_dim_stable_returns_512
AC-6 (engine output shape mismatch → ConfigError) test_ac6_create_rejects_engine_output_shape_mismatch, test_ac6_create_rejects_engine_with_missing_embedding_key
AC-7 (VprBackboneError on forward fail + ERROR log + FDR) test_ac7_runtime_error_yields_vpr_backbone_error, test_ac7_missing_embedding_key_yields_vpr_backbone_error, test_ac7_wrong_forward_output_shape_yields_vpr_backbone_error
AC-8 (VprPreprocessError on corrupt image + ERROR log + FDR) test_ac8_corrupt_image_yields_vpr_preprocess_error, test_ac8_wrong_dtype_image_yields_vpr_preprocess_error
AC-9 (calibration absent → geometric centre + WARN) test_ac9_identity_calibration_falls_back_to_geometric_centre, test_ac9_principal_point_offset_changes_crop_window
AC-10 (IndexUnavailableError re-raised unchanged) test_ac10_index_unavailable_propagates_unchanged
AC-11 (composition-root wiring + BUILD-flag gate) test_ac11_create_emits_strategy_ready_info_log, test_ac11_non_trt_runtime_rejected_at_create, test_ac11_onnx_trt_ep_runtime_accepted_at_create
AC-12 (top-1 > threshold → WARN via FaissBridge) test_ac12_top1_above_threshold_emits_warn_via_faiss_bridge

Tests

  • tests/unit/c2_vpr/test_ultra_vpr.py: 29 / 29 PASS in 2.5s.
  • Full unit suite: 1637 passed / 80 skipped / 0 failed in ~66s. Up from 1608 at the close of Batch 46 (+29 new tests).
  • ruff check on all new + modified files: clean.

Architectural Review

F1 — _iso_ts_from_clock is now the 7th copy (Low / Maintainability, carried)

_iso_ts_from_clock appears verbatim in ultra_vpr.py lines 296-303, matching the same helper in net_vlad.py, _faiss_bridge.py, c11_*, c12_*, c6_tile_cache.postgres_filesystem_store, and c6_tile_cache.freshness_gate. This is the 7th identical copy. Already tracked by AZ-508 ("ISO timestamp consolidation, 2pt"). Recommend prioritising AZ-508 before the remaining C2 strategies (AZ-339, AZ-340) add copies #8 and #9.

F2 — Spec→implementation drift on C7 API names (Low / Spec-Hygiene)

The AZ-337 spec § Outcome uses outdated C7 API names:

  • Spec: runtime.forward(engine_id, {...})["embedding"]
  • Live: runtime.infer(handle, {...}) returning dict[str, ndarray]
  • Spec: runtime.load_engine(weights_path)
  • Live: runtime.compile_engine(model_path, build_config)EngineCacheEntry, then deserialize_engine(entry)EngineHandle

Same drift was flagged in Batch 46 (AZ-338) review as F2. The implementation aligns with the live v1.0.0 Protocol (AZ-297); spec text is stale. Recommendation: a spec-hygiene PBI for AZ-339 / AZ-340 / AZ-358 / AZ-349 to refresh references to the v1.0.0 C7 Protocol BEFORE those tasks are picked up — otherwise each batch repeats the same "spec said X, code does Y" review note.

F3 — Principal-point fallback heuristic relies on identity-matrix detection (Low / Test-Robustness)

UltraVprBackbonePreprocessor._extract_principal_point treats (cx, cy) == (0, 0) as "no calibration data" because the test fixture uses np.eye(3) for "missing" calibration. A real camera calibration with a genuine principal point near the top-left (unusual but legal for cropped sensors) would also be skipped to geometric-centre fallback. The heuristic is correct for production (intrinsics zeroed → no calibration) but the test fixture trick would benefit from a None sentinel or a flag rather than relying on the zero-equality check. Risk is bounded: no real camera has cx == 0 and cy == 0; the worst-case is a one-frame mis-crop with a graceful WARN log. Not blocking. Recommendation: when a real intrinsics_3x3 == None path lands (currently the dataclass field is Any not Optional), tighten the type annotation and remove the zero-detection branch.

Architecture Notes (Strengths)

  1. AZ-507 layering clean. UltraVPR consumes InferenceRuntimeCut + DescriptorIndexCut (both C2-owned structural cuts), never components.c7_inference or components.c6_tile_cache directly. Architecture lint test test_ac6_only_compose_root_imports_concrete_strategies PASS.
  2. No PyTorch architecture registration. UltraVPR is the first strategy that does NOT register a NN architecture (TRT-engine-only, no PyTorch fallback). Verified by test_create_does_not_register_pytorch_architecture. The composition-root _register_strategy_architecture helper no-ops cleanly for this case — no code change needed there.
  3. Engine load + output-shape assertion at create time. Failure surfaces at composition time, NOT at first frame. Matches Constraint § 5 of the task spec.
  4. Single-stage L2 normalisation. Explicitly verified to NOT call intra_cluster_normalise (NetVLAD's two-stage path). This is a regression-blocking spy in test_ac2_embedding_is_single_stage_l2_no_intra_cluster_path — if a future refactor accidentally adds the intra-cluster step, recall would silently degrade on the Derkachi corpus.
  5. Constructor-injection only. No import of gps_denied_onboard.config inside ultra_vpr.py; config is consumed exclusively through the create() factory parameter.
  6. Pattern parity with NetVLAD. embed_query / retrieve_topk / _emit_* shape mirrors NetVladStrategy line-for-line where semantics permit; UltraVPR-specific paths (single-stage L2, "embedding" output key, TRT runtime, no architecture registry) are clearly localised.
  7. AC-12 delegation. FaissBridge already owns the top-1-distance WARN log; UltraVPR inherits this for free via the same delegation that NetVLAD uses — one production touchpoint, two strategies. Confirmed by direct test.
  8. Calibration consumption matches spec. The principal-point crop is the documented UltraVPR preprocessing; the geometric- centre fallback path with WARN log satisfies AC-9 exactly.

Performance

Performance NFR (C2-PT-01 embed_query p95 ≤ 60 ms on Tier-1 Jetson Orin) is deferred to E-BBT per task spec § NFRs. Macbook dev tier has no TRT 10.3 + Jetson Orin to benchmark against. Carry-over to the next cumulative review: F3 from the 43-45 cumulative report already tracks "Tier-1 perf microbenchmarks deferred"; AZ-337 adds to that backlog.

Comparison vs Batch 46 (AZ-338)

Aspect NetVLAD (B46) UltraVPR (B47)
Runtime label pytorch_fp16 tensorrt / onnx_trt_ep
Engine input .pth state dict .trt engine file
Architecture registry binds factory no-op
Descriptor dim 4096 (configurable PCA) 512 (fixed)
Normalisation intra-cluster THEN L2 L2 only
Output key vlad_descriptor embedding
Input shape (480, 480) (384, 384)
Calibration use ignored principal-point crop
Test count 31 29

Verdict

PASS_WITH_WARNINGS. Three Low-severity findings, none blocking:

  • F1 (carried from B46 / cumulative 43-45): _iso_ts_from_clock is the 7th copy; AZ-508 will consolidate.
  • F2 (carried from B46): spec→implementation drift on C7 API names; affects future C2 strategies AZ-339 / AZ-340.
  • F3 (new): principal-point fallback heuristic uses zero-detection for "no calibration"; safe for production but could be tightened when calibration becomes Optional.

No Critical, High, or Medium findings. AZ-337 may transition to In Testing.