Files
Oleksandr Bezdieniezhnykh 87909cce9f [AZ-340] C2 SelaVPR + EigenPlaces + SALAD secondary VPR backbones
Three new VprStrategy implementations for IT-12 comparative-study
(research binary only, gated OFF for airborne / operator-tooling per
ADR-002). All run via the C7 TensorRT runtime (or ONNX-RT fallback)
with their own concrete BackbonePreprocessor, single-stage L2
normalisation, and FaissBridge-delegated retrieval — same pattern as
AZ-339 (MegaLoc + MixVPR), parametrised in tests for compactness.

  * SelaVprStrategy   — D=512,  input 224x224
  * EigenPlacesStrategy — D=2048, input 480x480
  * SaladStrategy     — D=8448, input 322x322 (DINOv2-Large backbone;
                        heaviest in the C2 family — NFR-perf budget
                        relaxed to 120 ms p95 / 1200 MB GPU per task
                        spec)

The composition-root factory tables and KNOWN_STRATEGIES set were
already pre-wired at AZ-336 land time; module-layout.md already names
all three Internal entries and BUILD_VPR_* rows. No CMake change
required (env-flag gating).

54 unit tests (3 strategies * 18 cases) cover AC-1..AC-11 plus extras
(single-stage L2, NCHW FP16, constructor validation, FDR emission).
All pass; sibling c2_vpr suite + composition-root regression + AZ-526
iso-ts regression all green.

Code review verdict: PASS_WITH_WARNINGS. Two Low findings logged in
batch_51_review.md: F1 escalates `_assert_engine_output_dim`
duplication from 4-way to 7-way (already tracked by AZ-527 hygiene
PBI; will surface in cumulative review batches 49-51); F2 mirrors the
AZ-337 / 338 / 339 AC-10 spec-drift precedent (literal
ConfigurationError vs implemented ConfigError / StrategyNotAvailable).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 00:32:38 +03:00

16 KiB
Raw Permalink Blame History

Code Review Report — Batch 51

Batch: 51 Tasks: AZ-340 (C2 SelaVPR + EigenPlaces + SALAD Secondary Backbones — Research-only, 5 pts) Date: 2026-05-14 Verdict: PASS_WITH_WARNINGS (2 Low findings — one is a carry-over already tracked by AZ-527, the other mirrors the documented AZ-337/338/339 spec-drift precedent)

Phase 1 — Context Loading

Reviewed:

  • _docs/02_tasks/todo/AZ-340_c2_selavpr_eigenplaces_salad.md — task spec, ACs, NFRs, scope, constraints, runtime completeness section.
  • _docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md (referenced; behavioural Protocol contract).
  • _docs/02_document/components/02_c2_vpr/description.md § 1, § 5, § 6 (referenced; secondary backbone designation, list of strategies including SALAD, preprocessor-duplication policy).
  • _docs/02_document/module-layout.md — Component: c2_vpr Internal entry already pre-declares sela_vpr.py, eigen_places.py, salad.py; BUILD_VPR_SELAVPR / BUILD_VPR_EIGENPLACES / BUILD_VPR_SALAD rows already in the build-flag matrix.
  • Existing AZ-339 implementation (mega_loc.py / mix_vpr.py / _preprocessor_*.py) as the architectural template — AZ-340 is the explicit twin task (research-only secondary backbones via TRT, no PyTorch architecture registry).
  • Composition root runtime_root/vpr_factory.py_STRATEGY_TO_BUILD_FLAG and _STRATEGY_TO_MODULE already include sela_vpr / eigen_places / salad rows (pre-wired at AZ-336 land time).
  • C2 config KNOWN_STRATEGIES already includes the three names.

Files mapped to AZ-340:

Task Files
AZ-340 src/gps_denied_onboard/components/c2_vpr/sela_vpr.py (new), eigen_places.py (new), salad.py (new), _preprocessor_sela_vpr.py (new), _preprocessor_eigen_places.py (new), _preprocessor_salad.py (new), tests/unit/c2_vpr/test_az340_sela_vpr_eigen_places_salad.py (new)

Phase 2 — Spec Compliance Review

AC Status Test reference Notes
AC-1 (each) PASS `test_ac1_protocol_conformance[sela_vpr eigen_places
AC-2 (each) PASS test_ac2_embed_query_returns_unit_norm_fp16_correct_dim[*] + test_ac2_single_stage_l2_no_intra_cluster_call[*] Shape (512,) / (2048,) / (8448,); FP16 dtype; ‖x‖₂ ≈ 1.0 ± 1e-3. Single-stage L2 confirmed by spy normaliser; no intra_cluster_normalise call.
AC-3 (each) PASS test_ac3_embed_query_deterministic_for_same_frame[*] Bit-exact embeddings across 3 calls on same frame.
AC-4 (each) PASS test_ac4_retrieve_topk_returns_exactly_k_with_correct_label[*] len == 10, ascending distances, correct backbone_label, correct descriptor_dim.
AC-5 (each) PASS test_ac5_descriptor_dim_stable[*] 100 calls, returns 512 / 2048 / 8448.
AC-6 (each) PASS test_ac6_create_rejects_engine_output_shape_mismatch[*] + test_ac6_create_rejects_missing_embedding_key[*] ConfigError raised at create-time; the dry-run probe matches each preprocessor's input_shape().
AC-7 (each) PASS test_ac7_runtime_error_yields_vpr_backbone_error[*] + test_ac7_wrong_forward_output_shape_yields_vpr_backbone_error[*] RuntimeError / wrong-shape outputs raise VprBackboneError; ERROR log + FDR record emitted.
AC-8 (each) PASS test_ac8_corrupt_image_yields_vpr_preprocess_error[*] Non-array frame.image raises VprPreprocessError; ERROR log + FDR record emitted.
AC-9 (each) PASS test_ac9_create_emits_ready_log_with_correct_label_and_dim[*] INFO log kind="c2.vpr.ready" with correct {strategy, descriptor_dim}.
AC-10 (each) PASS with drift test_ac10_runtime_label_mismatch_raises_config_error[*] See F2 below — same documented precedent as AZ-337 / AZ-338 / AZ-339.
AC-11 (each) PASS test_ac11_preprocessor_input_shape[*] + test_preprocess_output_is_nchw_fp16[*] (224, 224) / (480, 480) / (322, 322).

Constraint compliance:

  • Each strategy ships its own concrete preprocessor (✓).
  • Centre-crop logic duplicated, NOT shared (✓ — explicit duplication, comment references description.md § 6).
  • All three use TensorRT runtime via InferenceRuntimeCut (✓).
  • No engine compilation in this task (✓ — inference_runtime.compile_engine is the local TRT runtime; the actual .trt file source is C10 / AZ-321).
  • All three hold engine handles, not engines (✓).
  • No GPU operations in __init__ beyond engine load (✓ — engine load is in create(), not in __init__).
  • SALAD's 8448-d output is non-negotiable, no PCA at strategy-level (✓ — DESCRIPTOR_DIM = 8448 is a Final constant; PCA gating left for a future BUILD_VPR_SALAD_PCA flag, out of scope).
  • Logging + FDR records mirror UltraVPR / MegaLoc / MixVPR pattern (✓).

Phase 3 — Code Quality Review

  • SOLID: each strategy is a single class with one responsibility (one backbone). Constructor injection throughout. Module-level create() factory keeps composition concerns out of __init__.
  • Error handling: explicit try / raise for backbone failures; _wrap_backbone_error rewraps general Exception into the typed envelope. No bare except. ERROR logs emitted with structured extra={...}.
  • Naming: SelaVprStrategy / EigenPlacesStrategy / SaladStrategy, _BACKBONE_LABEL, DESCRIPTOR_DIM, _OUTPUT_KEY, _LOG_KIND_*, _FDR_KIND_* — consistent with the existing C2 strategy modules.
  • Complexity: embed_query ≈ 50 lines per strategy (matches MegaLoc / MixVPR; under the 50-line / cyc-10 threshold). create() ≈ 60 lines but mostly orchestration; mirrors the AZ-339 baseline.
  • DRY: significant duplication across three strategies + three preprocessors. Documented as intentional per description.md § 6 + module-level docstrings + tracked by AZ-527 (the planned hygiene PBI scoped at AZ-339 land). See F1 below.
  • Test quality: parametrised across the three strategies; each test asserts meaningful behaviour (shape, dtype, norm, label, error type, log kind, FDR record kind). 54/54 PASS.
  • Dead code: none — ruff check clean across all six new modules and the test file.

Phase 4 — Security Quick-Scan

  • No SQL, no subprocess, no eval / exec, no pickle deserialisation.
  • No hardcoded secrets / credentials / API keys.
  • No external input ingested without validation (_coerce_to_rgb_uint8 rejects non-numpy / wrong-dtype / wrong-shape image bytes).
  • Sensitive data in logs: only frame_id, backbone_label, descriptor_dim, error_type, error-message string truncated to 512 chars. No PII / no calibration intrinsics / no embedding payloads. ✓

Phase 5 — Performance Scan

  • No O(n²) algorithms in the new code paths. FAISS retrieval is delegated to FaissBridge (AZ-341) and runs once per query.
  • No N+1 query patterns (no DB access in c2_vpr).
  • No unbounded data fetching.
  • No blocking I/O in async contexts (codebase is synchronous; the strategies are explicitly single-threaded per INV-1).
  • Memory copies: np.ascontiguousarray(...) on raw[0] produces one copy of the (D,) embedding before normalisation; preprocessor.preprocess produces one (1, 3, H, W) FP16 NCHW buffer. Both are intrinsic to the per-frame VPR contract; no avoidable allocations in the hot path.
  • SALAD's 8448-d output and 322×322 input are acknowledged in the task spec NFR-perf budget (≤ 120 ms / ≤ 1200 MB GPU; ≤ 6 ms FAISS lookup); no performance regression introduced by the strategy code itself — the cost is intrinsic to the chosen backbone.

Phase 6 — Cross-Task Consistency

Batch 51 contains a single task (AZ-340), so cross-task drift inside the batch is not applicable.

Inter-batch consistency with the prior C2 batches (45 / 46 / 47 / 50):

  • All four C2 secondary strategy modules (mega_loc, mix_vpr, sela_vpr, eigen_places, salad — five total now) share the identical TRT-only contract and embed_query / retrieve_topk / descriptor_dim shapes. ✓
  • All four use iso_ts_from_clock from the AZ-526 helper from day one. ✓
  • All four use the local _faiss_bridge.FaissBridge and the AZ-507 inference_runtime_cut / descriptor_index_cut Protocol cuts. ✓
  • All four use ImageNet mean/std normalisation in their preprocessor. ✓

Phase 7 — Architecture Compliance

  1. Layer direction: every import in the six new files resolves to a Layer-1 (_types, helpers, config, logging, fdr_client, clock) or same-component (c2_vpr.*) target. Verified via from ... import survey of each new file. ✓ No layer-direction violations.

  2. Public API respect: no from gps_denied_onboard.components.c6_tile_cache import ... and no from gps_denied_onboard.components.c7_inference import ... in any of the new files. The c6 / c7 surfaces are reached only via the constructor-injected DescriptorIndexCut / InferenceRuntimeCut Protocols (AZ-507 pattern). ✓

  3. No new cyclic module dependencies: the strategy modules are leaf modules within c2_vpr/. No new cycles introduced. ✓

  4. Duplicate symbols across components: all new symbols (SelaVprStrategy, EigenPlacesStrategy, SaladStrategy, *BackbonePreprocessor, DESCRIPTOR_DIM, *_INPUT_HW) live exclusively under src/gps_denied_onboard/components/c2_vpr/. No duplication into other components.

    Within c2_vpr/ itself: the _assert_engine_output_dim helper is now duplicated 7-way (was 4-way after AZ-339, becomes 7-way after AZ-340). This is F1 below; the duplication is intentional and tracked by AZ-527.

  5. Cross-cutting concerns not locally re-implemented: no local _iso_ts_from_clock; no local logger setup; no local config loading; no local FDR record schema. All cross-cutting work goes through the established shared modules. ✓ AZ-526 regression guard (the AST scan in test_az508_iso_timestamps.py) confirms no new module-level _iso_ts_from_clock / _iso_ts_now definitions.

Findings

# Severity Category File:Line Title
F1 Low Maintainability c2_vpr/{sela_vpr,eigen_places,salad}.py (3 new copies) + c2_vpr/{ultra_vpr,net_vlad,mega_loc,mix_vpr}.py (4 pre-existing copies) _assert_engine_output_dim now 7-way duplicated across c2_vpr strategies
F2 Low Spec-Gap (Documentation drift) c2_vpr/{sela_vpr,eigen_places,salad}.py create() runtime-label guard AZ-340 AC-10 literally names ConfigurationError; implementation raises ConfigError (matching AZ-337 / AZ-338 / AZ-339 precedent)

Finding Details

F1: _assert_engine_output_dim 7-way duplicated across c2_vpr strategies (Low / Maintainability)

  • Locations (3 new + 4 pre-existing copies):
    • src/gps_denied_onboard/components/c2_vpr/sela_vpr.py:_assert_engine_output_dim (new)
    • src/gps_denied_onboard/components/c2_vpr/eigen_places.py:_assert_engine_output_dim (new)
    • src/gps_denied_onboard/components/c2_vpr/salad.py:_assert_engine_output_dim (new)
    • src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:_assert_engine_output_dim (pre-existing)
    • src/gps_denied_onboard/components/c2_vpr/net_vlad.py:_assert_engine_output_dim (pre-existing)
    • src/gps_denied_onboard/components/c2_vpr/mega_loc.py:_assert_engine_output_dim (pre-existing)
    • src/gps_denied_onboard/components/c2_vpr/mix_vpr.py:_assert_engine_output_dim (pre-existing)
  • Description: Each copy is the same ~30-line helper that runs a zero-init dry-run inference, asserts the engine's output dict has an "embedding" key, and asserts the output ndarray shape is (1, DESCRIPTOR_DIM). The only thing that differs across the 7 copies is the local DESCRIPTOR_DIM constant. The duplication is intentional and tracked: each copy carries an in-line comment referencing AZ-527 (the hygiene PBI scoped in parallel with AZ-339 land).
  • Suggestion: extract to a c2-internal helper module c2_vpr/_engine_dim_assertion.py (or similar) under AZ-527. The helper would take descriptor_dim and _assert_engine_output_dim(inference_runtime, handle, preprocessor, descriptor_dim) could be a single shared function. Each strategy keeps its own DESCRIPTOR_DIM constant; the helper reads the dim from the parameter.
  • Why Low (not Medium): this finding is the carry-over of cumulative review batches 46-48 F2 (also Low) — the duplication has been growing since AZ-337 / AZ-338. AZ-527 is already on the backlog; AZ-340 was scoped to land the three secondary strategies, and pulling the consolidation forward would have expanded AZ-340's scope past the spec.
  • Cumulative-review handoff: this finding will surface in the cumulative review for batches 4951 (about to run after batch 51 commits). AZ-527 closes it.
  • Task: AZ-340

F2: AC-10 spec drift — ConfigError raised instead of literally-named ConfigurationError (Low / Spec-Gap, documentation drift)

  • Locations:
    • src/gps_denied_onboard/components/c2_vpr/sela_vpr.py create() runtime-label guard (line ~365)
    • src/gps_denied_onboard/components/c2_vpr/eigen_places.py create() runtime-label guard (line ~365)
    • src/gps_denied_onboard/components/c2_vpr/salad.py create() runtime-label guard (line ~365)
  • Description: AZ-340 AC-10 literally specifies ConfigurationError for the build-flag-OFF case. The implementation raises StrategyNotAvailableError (composition root, env-flag check; thrown BEFORE create() is reached) and ConfigError (strategy module's own runtime-label guard). Both are composition-time fail-fast errors. AZ-337 / AZ-338 / AZ-339 follow this same pattern; AZ-340 mirrors them.
  • Suggestion: amend the AC-10 wording in a future spec pass (AZ-340_c2_selavpr_eigenplaces_salad.md already archived; the documentation drift will be cleared by a single shared spec-pass that touches all five secondary backbones). No code change required. The AC-10 test (test_ac10_runtime_label_mismatch_raises_config_error) carries an in-line docstring that documents the precedent.
  • Task: AZ-340

Baseline Delta

_docs/02_document/architecture_compliance_baseline.md does not exist for this project — Phase 7 ran without baseline-delta partitioning. No tables to emit.

Test Results

  • tests/unit/c2_vpr/test_az340_sela_vpr_eigen_places_salad.py54 / 54 PASS (3 strategies × 18 test cases).
  • tests/unit/c2_vpr/ (full directory: faiss_bridge + net_vlad + ultra_vpr + AZ-339 + AZ-340 + protocol_conformance) — 180 / 180 PASS.
  • tests/unit/test_az270_compose_root.py8 / 8 PASS (no composition-root regressions; the factory routing for sela_vpr / eigen_places / salad was pre-wired at AZ-336 land time).
  • tests/unit/test_az508_iso_timestamps.py18 / 18 PASS (AZ-526 regression guard confirms no new _iso_ts_from_clock / _iso_ts_now duplicates introduced by AZ-340).
  • ruff check on all 7 new files — clean (one auto-fixable RUF022 __all__ not sorted in _preprocessor_eigen_places.py was caught and fixed before commit).

Auto-Fix Eligibility

Finding Severity Eligible? Action
F1 Low / Maintainability Yes (per the auto-fix matrix) NOT auto-fixed in batch 51 — it requires creating an AZ-527 hygiene PBI and is scoped as a separate batch. Logged as a follow-on for AZ-527. The cumulative review for batches 4951 will re-surface it.
F2 Low / Spec-Gap No (Spec-Gap is escalate per matrix) Documentation drift — no code change required. Logged for the next spec-pass.

Verdict

PASS_WITH_WARNINGS — no Critical or High findings. The two Low findings are both pre-existing patterns (F1 is a 4 → 7 escalation of an already-tracked duplication; F2 is the same precedent as AZ-337 / 338 / 339). The implement skill auto-fix gate proceeds to commit without user intervention.

Follow-on Work

  • AZ-527 (Hygiene — consolidate _assert_engine_output_dim into a c2-internal helper). 2 points. Now must consolidate 7 copies, not 4. Depends on AZ-340. To be created and prioritised by the cumulative review for batches 4951.
  • Spec-pass to align AC-10 wording across AZ-337 / AZ-338 / AZ-339 / AZ-340 task specs with the implemented ConfigError / StrategyNotAvailableError envelope. Out of scope for the implement skill; should be scheduled in a documentation-sync batch.