[AZ-337] C2 UltraVPR primary backbone VprStrategy

UltraVPR is the Documentary Lead's PRIMARY backbone per description.md § 1 and is wired by default (config.c2_vpr.strategy = "ultra_vpr"). Runs on the C7 TensorRT runtime (AZ-298) or ONNX-Runtime fallback (AZ-299); explicitly NOT on the PyTorch FP16 runtime so a TRT engine compile bug can fall back to NetVLAD without simultaneously breaking both strategies. Production changes: - c2_vpr/ultra_vpr.py - UltraVprStrategy + module-level create() factory. embed_query pipeline: preprocess -> runtime.infer -> single-stage L2 -> VprQuery. retrieve_topk delegates one-line to FaissBridge. Engine load + output-shape assertion happen at create() time (AC-6) so misconfiguration surfaces at startup, not 17 minutes into a flight. UltraVPR has D=512 fixed (NOT a config knob; AC-5 / AC-6 / AC-7 all assume 512). Single-stage L2 (no intra-cluster step like NetVLAD; spy-test enforces this so a future refactor cannot silently regress recall). - c2_vpr/_preprocessor_ultra_vpr.py - centre-crop using the camera calibration's principal point (cx, cy from intrinsics_3x3), falling back to geometric centre + WARN log when calibration is absent (AC-9). Resize -> (384, 384) -> ImageNet mean/std -> FP16 NCHW. - No composition-root changes: UltraVPR consumes a pre-compiled .trt engine (no PyTorch nn.Module), so the strategy module does NOT expose MODEL_NAME / architecture_factory. The composition- root _register_strategy_architecture helper no-ops cleanly for this case (verified by test_create_does_not_register_pytorch_architecture). Tests: - tests/unit/c2_vpr/test_ultra_vpr.py - 29 tests covering all 12 ACs + preprocessor contract + constructor validation + FDR record emission + single-stage L2 enforcement. Full unit suite: 1637 passed / 80 env-skipped (+29 new tests). Per-batch code review (batch_47_review.md): PASS_WITH_WARNINGS (3 Low-severity findings; no Critical / High / Medium): - F1: _iso_ts_from_clock is now the 7th copy (AZ-508 will close). - F2: AZ-337 spec uses outdated C7 API names; affects upcoming AZ-339 / AZ-340. Spec-hygiene PBI recommended. - F3: principal-point fallback uses (0, 0) zero-detection for missing calibration; safe but tightens when intrinsics become Optional. Architectural notes: - AZ-507 layering clean. Imports only InferenceRuntimeCut, DescriptorIndexCut, c2_vpr internals, _types, helpers, clock, fdr_client. Architecture lint test passes. - Pattern parity with NetVLAD (B46) where semantics permit; UltraVPR-specific paths (single-stage L2, 'embedding' output key, TRT runtime, no architecture registry, principal-point crop) are clearly localised. Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 22:41:13 +00:00 · 2026-05-13 22:43:17 +03:00
parent 773d589d34
commit 3c4fd272f1
5 changed files with 1692 additions and 1 deletions
@@ -0,0 +1,165 @@
+# Batch 47 / Cycle 1 — Per-Batch Code Review
+
+**Date**: 2026-05-13
+**Tasks**: AZ-337 — C2 UltraVPR Primary Backbone (5pt)
+**Reviewer**: autodev orchestrator (inline review)
+**Verdict**: `PASS_WITH_WARNINGS` — three Low-severity findings, none blocking.
+
+## Scope
+
+- `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py` (new, 372 lines)
+- `src/gps_denied_onboard/components/c2_vpr/_preprocessor_ultra_vpr.py` (new, 188 lines)
+- `tests/unit/c2_vpr/test_ultra_vpr.py` (new, 581 lines, 29 tests)
+- `_docs/_autodev_state.md` (state bump)
+
+No modifications to existing production files. The composition-root
+helper `_register_strategy_architecture` already no-ops cleanly for
+strategies that do not expose `MODEL_NAME` / `architecture_factory` —
+exactly the design path AZ-337 takes.
+
+## Acceptance-Criteria Coverage Verification
+
+12/12 ACs have at least one covering test:
+
+| AC | Test(s) |
+|----|---------|
+| AC-1 (Protocol conformance) | `test_ac1_protocol_conformance` |
+| AC-2 (L2-norm FP16 (512,)) | `test_ac2_embed_query_returns_unit_norm_fp16_512`, `test_ac2_embedding_is_single_stage_l2_no_intra_cluster_path` |
+| AC-3 (deterministic) | `test_ac3_embed_query_deterministic_for_same_frame` |
+| AC-4 (`retrieve_topk == k`, sorted, label) | `test_ac4_retrieve_topk_returns_exactly_k_with_ultra_vpr_label` |
+| AC-5 (`descriptor_dim()` stable returns 512) | `test_ac5_descriptor_dim_stable_returns_512` |
+| AC-6 (engine output shape mismatch → `ConfigError`) | `test_ac6_create_rejects_engine_output_shape_mismatch`, `test_ac6_create_rejects_engine_with_missing_embedding_key` |
+| AC-7 (`VprBackboneError` on forward fail + ERROR log + FDR) | `test_ac7_runtime_error_yields_vpr_backbone_error`, `test_ac7_missing_embedding_key_yields_vpr_backbone_error`, `test_ac7_wrong_forward_output_shape_yields_vpr_backbone_error` |
+| AC-8 (`VprPreprocessError` on corrupt image + ERROR log + FDR) | `test_ac8_corrupt_image_yields_vpr_preprocess_error`, `test_ac8_wrong_dtype_image_yields_vpr_preprocess_error` |
+| AC-9 (calibration absent → geometric centre + WARN) | `test_ac9_identity_calibration_falls_back_to_geometric_centre`, `test_ac9_principal_point_offset_changes_crop_window` |
+| AC-10 (`IndexUnavailableError` re-raised unchanged) | `test_ac10_index_unavailable_propagates_unchanged` |
+| AC-11 (composition-root wiring + BUILD-flag gate) | `test_ac11_create_emits_strategy_ready_info_log`, `test_ac11_non_trt_runtime_rejected_at_create`, `test_ac11_onnx_trt_ep_runtime_accepted_at_create` |
+| AC-12 (top-1 > threshold → WARN via FaissBridge) | `test_ac12_top1_above_threshold_emits_warn_via_faiss_bridge` |
+
+## Tests
+
+- `tests/unit/c2_vpr/test_ultra_vpr.py`: **29 / 29 PASS** in 2.5s.
+- Full unit suite: **1637 passed / 80 skipped / 0 failed** in ~66s.
+  Up from 1608 at the close of Batch 46 (+29 new tests).
+- `ruff check` on all new + modified files: clean.
+
+## Architectural Review
+
+### F1 — `_iso_ts_from_clock` is now the 7th copy (Low / Maintainability, carried)
+
+`_iso_ts_from_clock` appears verbatim in `ultra_vpr.py` lines 296-303,
+matching the same helper in `net_vlad.py`, `_faiss_bridge.py`,
+`c11_*`, `c12_*`, `c6_tile_cache.postgres_filesystem_store`, and
+`c6_tile_cache.freshness_gate`. This is the 7th identical copy.
+Already tracked by **AZ-508** ("ISO timestamp consolidation, 2pt").
+Recommend prioritising AZ-508 before the remaining C2 strategies
+(AZ-339, AZ-340) add copies #8 and #9.
+
+### F2 — Spec→implementation drift on C7 API names (Low / Spec-Hygiene)
+
+The AZ-337 spec § Outcome uses outdated C7 API names:
+
+- Spec: `runtime.forward(engine_id, {...})["embedding"]`
+- Live: `runtime.infer(handle, {...})` returning `dict[str, ndarray]`
+- Spec: `runtime.load_engine(weights_path)`
+- Live: `runtime.compile_engine(model_path, build_config)` →
+  `EngineCacheEntry`, then `deserialize_engine(entry)` → `EngineHandle`
+
+Same drift was flagged in Batch 46 (AZ-338) review as F2. The
+implementation aligns with the live v1.0.0 Protocol (AZ-297); spec
+text is stale. **Recommendation**: a spec-hygiene PBI for
+AZ-339 / AZ-340 / AZ-358 / AZ-349 to refresh references to the
+v1.0.0 C7 Protocol BEFORE those tasks are picked up — otherwise
+each batch repeats the same "spec said X, code does Y" review note.
+
+### F3 — Principal-point fallback heuristic relies on identity-matrix detection (Low / Test-Robustness)
+
+`UltraVprBackbonePreprocessor._extract_principal_point` treats
+`(cx, cy) == (0, 0)` as "no calibration data" because the test
+fixture uses `np.eye(3)` for "missing" calibration. A real camera
+calibration with a genuine principal point near the top-left
+(unusual but legal for cropped sensors) would also be skipped to
+geometric-centre fallback. The heuristic is correct for production
+(intrinsics zeroed → no calibration) but the test fixture trick
+would benefit from a `None` sentinel or a flag rather than relying
+on the zero-equality check. **Risk is bounded**: no real camera
+has `cx == 0 and cy == 0`; the worst-case is a one-frame mis-crop
+with a graceful WARN log. Not blocking. **Recommendation**: when a
+real `intrinsics_3x3 == None` path lands (currently the dataclass
+field is `Any` not `Optional`), tighten the type annotation and
+remove the zero-detection branch.
+
+### Architecture Notes (Strengths)
+
+1. **AZ-507 layering clean**. UltraVPR consumes
+   `InferenceRuntimeCut` + `DescriptorIndexCut` (both C2-owned
+   structural cuts), never `components.c7_inference` or
+   `components.c6_tile_cache` directly. Architecture lint test
+   `test_ac6_only_compose_root_imports_concrete_strategies` PASS.
+2. **No PyTorch architecture registration**. UltraVPR is the first
+   strategy that does NOT register a NN architecture
+   (TRT-engine-only, no PyTorch fallback). Verified by
+   `test_create_does_not_register_pytorch_architecture`. The
+   composition-root `_register_strategy_architecture` helper
+   no-ops cleanly for this case — no code change needed there.
+3. **Engine load + output-shape assertion at `create` time**.
+   Failure surfaces at composition time, NOT at first frame.
+   Matches Constraint § 5 of the task spec.
+4. **Single-stage L2 normalisation**. Explicitly verified to NOT
+   call `intra_cluster_normalise` (NetVLAD's two-stage path).
+   This is a regression-blocking spy in `test_ac2_embedding_is_single_stage_l2_no_intra_cluster_path` — if a
+   future refactor accidentally adds the intra-cluster step,
+   recall would silently degrade on the Derkachi corpus.
+5. **Constructor-injection only**. No `import` of
+   `gps_denied_onboard.config` inside `ultra_vpr.py`; config is
+   consumed exclusively through the `create()` factory parameter.
+6. **Pattern parity with NetVLAD**. `embed_query` / `retrieve_topk`
+   / `_emit_*` shape mirrors `NetVladStrategy` line-for-line where
+   semantics permit; UltraVPR-specific paths (single-stage L2,
+   `"embedding"` output key, TRT runtime, no architecture
+   registry) are clearly localised.
+7. **AC-12 delegation**. `FaissBridge` already owns the
+   top-1-distance WARN log; UltraVPR inherits this for free via
+   the same delegation that NetVLAD uses — one production
+   touchpoint, two strategies. Confirmed by direct test.
+8. **Calibration consumption matches spec**. The principal-point
+   crop is the documented UltraVPR preprocessing; the geometric-
+   centre fallback path with WARN log satisfies AC-9 exactly.
+
+## Performance
+
+Performance NFR (C2-PT-01 `embed_query` p95 ≤ 60 ms on Tier-1 Jetson
+Orin) is deferred to E-BBT per task spec § NFRs. Macbook dev tier
+has no TRT 10.3 + Jetson Orin to benchmark against. **Carry-over to
+the next cumulative review**: F3 from the 43-45 cumulative report
+already tracks "Tier-1 perf microbenchmarks deferred"; AZ-337 adds
+to that backlog.
+
+## Comparison vs Batch 46 (AZ-338)
+
+| Aspect | NetVLAD (B46) | UltraVPR (B47) |
+|--------|---------------|----------------|
+| Runtime label | `pytorch_fp16` | `tensorrt` / `onnx_trt_ep` |
+| Engine input | `.pth` state dict | `.trt` engine file |
+| Architecture registry | binds factory | no-op |
+| Descriptor dim | 4096 (configurable PCA) | 512 (fixed) |
+| Normalisation | intra-cluster THEN L2 | L2 only |
+| Output key | `vlad_descriptor` | `embedding` |
+| Input shape | `(480, 480)` | `(384, 384)` |
+| Calibration use | ignored | principal-point crop |
+| Test count | 31 | 29 |
+
+## Verdict
+
+`PASS_WITH_WARNINGS`. Three Low-severity findings, none blocking:
+
+- **F1** (carried from B46 / cumulative 43-45): `_iso_ts_from_clock`
+  is the 7th copy; AZ-508 will consolidate.
+- **F2** (carried from B46): spec→implementation drift on C7 API
+  names; affects future C2 strategies AZ-339 / AZ-340.
+- **F3** (new): principal-point fallback heuristic uses zero-detection
+  for "no calibration"; safe for production but could be tightened
+  when calibration becomes `Optional`.
+
+No Critical, High, or Medium findings. AZ-337 may transition to
+**In Testing**.
@@ -8,7 +8,7 @@ status: in_progress
 sub_step:
  phase: 7
  name: batch-loop
-  detail: "batch 46 complete — selecting batch 47"
+  detail: "batch 47 — AZ-337 (C2 UltraVPR primary backbone)"
 retry_count: 0
 cycle: 1
 tracker: jira