# C2 MegaLoc + MixVPR Secondary Backbones

**Task**: AZ-339_c2_megaloc_mixvpr
**Name**: C2 MegaLoc + MixVPR Secondary Backbones (Research-only)
**Description**: Implement `MegaLocStrategy` and `MixVprStrategy`, two secondary `VprStrategy` backbones used for IT-12 comparative-study purposes (research binary only). Both run on the C7 TensorRT runtime (same path as UltraVPR; FP16 engines compiled by C10) but are gated OFF for airborne and operator-tooling per ADR-002 — they're available only in the research binary and (selectable) replay-cli. Each strategy ships its own concrete `BackbonePreprocessor` (different resize target and normalisation per upstream code drop). Embeddings: MegaLoc D=2048, MixVPR D=4096. Both produce L2-normalised embeddings; both delegate `retrieve_topk` to the C6 TileStore Public API. Neither is on the production critical path; performance NFRs are looser than UltraVPR.
**Complexity**: 5 points
**Dependencies**: AZ-336_c2_vpr_strategy_protocol, AZ-263_initial_structure, AZ-269_config_loader, AZ-298_c7_tensorrt_runtime, AZ-303_c6_storage_interfaces, AZ-283_descriptor_normaliser, AZ-281_engine_filename_schema, AZ-321_c10_engine_compiler, AZ-266_log_module, AZ-272_fdr_record_schema
**Component**: c2_vpr (epic AZ-255 / E-C2)
**Tracker**: AZ-339
**Epic**: AZ-255 (E-C2)

### Document Dependencies

- `_docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md` — Protocol contract; both strategies satisfy every invariant.
- `_docs/02_document/components/02_c2_vpr/description.md` — § 1 secondary backbones for IT-12 comparative study; § 5 backbone library list.
- `_docs/02_document/module-layout.md` — `c2_vpr.mega_loc` and `c2_vpr.mix_vpr` Internal entries; `BUILD_VPR_MEGALOC` and `BUILD_VPR_MIXVPR` rows (both OFF for airborne/operator-tooling, ON for research; replay-cli inherits research selection at config time).
- `_docs/02_document/contracts/c7_inference/inference_runtime_protocol.md` — `InferenceRuntime` interface (TRT runtime).
- `_docs/02_document/contracts/shared_helpers/descriptor_normaliser.md` — L2 normalisation.

## Problem

Without this task:

- The IT-12 comparative-study cannot enumerate MegaLoc and MixVPR alongside UltraVPR / NetVLAD; researchers cannot quantify whether UltraVPR's PRIMARY designation is justified against the broader VPR-backbone landscape.
- The research binary's link surface is incomplete; the comparative-study CI matrix entry that asserts the research binary contains every secondary backbone fails.
- A future cycle that wants to swap MegaLoc to PRIMARY (e.g., if UltraVPR's upstream code drop becomes unmaintained) would have no migration path — the strategy class would not yet exist.

## Outcome

- `src/gps_denied_onboard/components/c2_vpr/mega_loc.py` defining `MegaLocStrategy` (Protocol-conforming) + `create(config, tile_store, inference_runtime)` factory entry-point.
  - Constructor signature: `__init__(self, runtime, tile_store, weights_path, preprocessor, normaliser, fdr_client)`.
  - `embed_query`: preprocess → TRT forward → L2 normalise → return `VprQuery`.
  - `retrieve_topk`: delegate to `tile_store.faiss_topk`; return `VprResult` with `backbone_label="mega_loc"`, `descriptor_dim=2048`.
  - `descriptor_dim() -> int`: returns 2048; engine output shape asserted at load.
- `src/gps_denied_onboard/components/c2_vpr/_preprocessor_mega_loc.py` defining `MegaLocBackbonePreprocessor`:
  - `input_shape() -> (322, 322)` per upstream MegaLoc default.
  - Normalisation: ImageNet mean/std (same as UltraVPR — common upstream convention; not a coupling, both happen to use ImageNet).
  - Centre-crop with calibration-aware logic (same pattern as UltraVPR / NetVLAD; copied not shared per description.md § 6).
  - Output dtype FP16, NCHW.
- `src/gps_denied_onboard/components/c2_vpr/mix_vpr.py` defining `MixVprStrategy` (mirrors `MegaLocStrategy` structure):
  - `backbone_label="mix_vpr"`, `descriptor_dim=4096`.
- `src/gps_denied_onboard/components/c2_vpr/_preprocessor_mix_vpr.py` defining `MixVprBackbonePreprocessor`:
  - `input_shape() -> (320, 320)` per upstream MixVPR default.
  - Normalisation: ImageNet mean/std.
  - Output dtype FP16, NCHW.
- Composition-root wiring paths for `config.vpr.strategy in {"mega_loc", "mix_vpr"}`.
- `BUILD_VPR_MEGALOC` and `BUILD_VPR_MIXVPR` CMake flags wired per ADR-002.
- Logging per description.md § 9 (INFO ready, WARN top-1-above-threshold, ERROR / FDR per error path).
- Engine output shape assertion at load for both strategies.
- Unit tests covering Protocol conformance, L2-normalisation, deterministic embeddings, top-K invariants, error paths — for BOTH strategies.

## Scope

### Included

- Both `MegaLocStrategy` and `MixVprStrategy` classes implementing the Protocol.
- Both concrete `BackbonePreprocessor` implementations (one per strategy; preprocessing parameters per upstream code drop).
- Module-level `create` factory functions for both.
- Composition-root wiring for both strategy choices.
- Engine output shape assertion at load for both.
- Logging + FDR records identical pattern to UltraVPR (per-backbone `backbone_label`).
- Unit tests for both strategies covering invariants + error paths.
- `BUILD_VPR_MEGALOC` and `BUILD_VPR_MIXVPR` CMake flag wiring.

### Excluded

- The `VprStrategy` Protocol — owned by AZ-336.
- Shared `DescriptorNormaliser` — already AZ-283.
- C7 TensorRT runtime — owned by AZ-298.
- Engine compilation — owned by AZ-321.
- Other backbones — AZ-337 (UltraVPR), AZ-338 (NetVLAD), AZ-340 (SelaVPR + EigenPlaces + SALAD).
- FAISS retrieve wiring — owned by AZ-341.
- Recall@10 acceptance tests for these secondary backbones — deferred to Step 9 / E-BBT (and the floors are looser per the engine rule — these are research-only, not engine-rule-binding).
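
The `embed_query` pipeline and the load-time shape assertion listed under Outcome can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the project's real API: the two-method `VprStrategy` Protocol is a reduction of the AZ-336 contract, `FakeInferenceRuntime.output_dim` is a hypothetical way to query the engine's output size, the preprocess step is a placeholder, and plain Python floats stand in for FP16 tensors.

```python
from __future__ import annotations

import math
from typing import Protocol, Sequence, runtime_checkable


class ConfigurationError(Exception):
    """Stand-in for the project's configuration error type."""


@runtime_checkable
class VprStrategy(Protocol):
    """Hypothetical two-method reduction of the AZ-336 Protocol."""

    def embed_query(self, frame: Sequence[float]) -> list[float]: ...

    def descriptor_dim(self) -> int: ...


class FakeInferenceRuntime:
    """Test double; `output_dim` is an assumed way to query the engine."""

    def __init__(self, output_dim: int) -> None:
        self._output_dim = output_dim

    def load_engine(self, weights_path: str) -> int:
        return 1  # an opaque engine ID, never the engine object itself

    def output_dim(self, engine_id: int) -> int:
        return self._output_dim

    def forward(self, engine_id: int, tensor: Sequence[float]) -> list[float]:
        # Deterministic pseudo-embedding so repeated calls agree exactly.
        base = sum(tensor)
        return [base + i for i in range(self._output_dim)]


def l2_normalise(vec: Sequence[float]) -> list[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [v / norm for v in vec]


class MegaLocStrategySketch:
    BACKBONE_LABEL = "mega_loc"
    DESCRIPTOR_DIM = 2048

    def __init__(self, runtime: FakeInferenceRuntime, weights_path: str) -> None:
        self._runtime = runtime
        self._engine_id = runtime.load_engine(weights_path)
        # Assert the engine output shape at load, not at first frame.
        actual = runtime.output_dim(self._engine_id)
        if actual != self.DESCRIPTOR_DIM:
            raise ConfigurationError(
                f"{self.BACKBONE_LABEL}: engine output dim {actual} "
                f"!= expected {self.DESCRIPTOR_DIM}"
            )

    def descriptor_dim(self) -> int:
        return self.DESCRIPTOR_DIM

    def embed_query(self, frame: Sequence[float]) -> list[float]:
        tensor = [float(v) for v in frame]  # placeholder for real preprocessing
        raw = self._runtime.forward(self._engine_id, tensor)
        return l2_normalise(raw)  # unconditional L2 normalisation
```

Under these assumptions, constructing the sketch against an engine that reports a 4096-wide output raises `ConfigurationError` before any frame is processed, which is the ordering the shape-assertion requirement asks for.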

## Acceptance Criteria

**AC-1 (per strategy): Protocol conformance**
Given a constructed `MegaLocStrategy` AND a constructed `MixVprStrategy`
When `isinstance(strategy, VprStrategy)` is evaluated
Then both return `True`

**AC-2 (per strategy): `embed_query` produces L2-normalised FP16 embedding of correct dim**
Given a valid `NavCameraFrame` and `CameraCalibration`
When `embed_query` is called on each strategy
Then MegaLoc returns `embedding.shape == (2048,)`, MixVPR returns `embedding.shape == (4096,)`; both are `dtype == np.float16`; both have `||embedding||_2 == 1.0 ± 1e-3`

**AC-3 (per strategy): Deterministic embeddings**
Given the same frame
When `embed_query` is called 3 times
Then bit-exact embeddings (ULP-tolerant FP16) for each strategy

**AC-4 (per strategy): `retrieve_topk` returns exactly k candidates with correct backbone_label**
Given a corpus of 100 tiles per strategy's descriptor_dim + a constructed `VprQuery`
When `retrieve_topk(query, k=10)` is called on each strategy
Then `len(candidates) == 10`, sorted ascending; `backbone_label == "mega_loc"` for MegaLoc; `backbone_label == "mix_vpr"` for MixVPR; `descriptor_dim` matches

**AC-5 (per strategy): `descriptor_dim()` is stable**
Given a constructed strategy
When `descriptor_dim()` is called 100 times
Then MegaLoc returns 2048 every call; MixVPR returns 4096 every call

**AC-6 (per strategy): Engine output shape mismatch → `ConfigurationError`**
Given a TRT engine whose output tensor shape does not match the strategy's expected `descriptor_dim`
When `create(...)` is called
Then `ConfigurationError` is raised; the strategy is NOT instantiated

**AC-7 (per strategy): `VprBackboneError` on forward-pass failure**
Given an `InferenceRuntime` test double that raises
When `embed_query` is called
Then `VprBackboneError` is raised; ERROR log + FDR record emitted

**AC-8 (per strategy): `VprPreprocessError` on corrupt image bytes**
Given a frame with malformed `image_bytes`
When `embed_query` is called
Then `VprPreprocessError` is raised; ERROR log + FDR record emitted

**AC-9 (per strategy): Composition-root wiring**
Given `config.vpr.strategy = "mega_loc"` (resp. `"mix_vpr"`) AND valid weights AND matching `descriptor_dim`
When `compose_root(config)` runs
Then the corresponding strategy is wired; AZ-336 factory's pre-flight `descriptor_dim` validation passes; INFO log `kind="c2.vpr.ready"` with `{strategy: "mega_loc", descriptor_dim: 2048}` (resp. `mix_vpr` / 4096) is emitted

**AC-10 (per strategy): Build-flag exclusion in airborne binary**
Given `config.vpr.strategy = "mega_loc"` (resp. `"mix_vpr"`) AND `BUILD_VPR_MEGALOC=OFF` (resp. `BUILD_VPR_MIXVPR=OFF`) — the airborne case
When the binary tries to load
Then `ConfigurationError` is raised at composition-root time with message containing the missing flag; the binary refuses to start (fail-fast per AZ-336 factory's lazy-import → ImportError → `ConfigurationError` mapping)

**AC-11 (per strategy): Preprocessing input shape**
Given the strategy's preprocessor instance
When `input_shape()` is called
Then MegaLoc returns `(322, 322)`; MixVPR returns `(320, 320)`

## Non-Functional Requirements

**Performance** (looser than UltraVPR — research-only, not on production critical path):

- MegaLoc `embed_query` p95 ≤ 80 ms on Tier-1 Jetson Orin (FP16 TRT).
- MixVPR `embed_query` p95 ≤ 100 ms on Tier-1 Jetson Orin (FP16 TRT) — slightly higher because MixVPR's mix-net is ~30% larger than UltraVPR's backbone.
- `retrieve_topk` p95: MegaLoc ≤ 3 ms, MixVPR ≤ 4 ms (4096-d FAISS HNSW slower than 512-d).
- GPU memory per strategy: MegaLoc ≤ 700 MB; MixVPR ≤ 800 MB resident.
- These NFRs are research-side guidance; not engine-rule blockers.

**Compatibility**

- Both consume TRT engines produced by AZ-321 with the AZ-281 self-describing filename schema.
- Upstream code drops pinned per Plan-phase; weight-format changes between drops require engine rebuild.

**Reliability**

- Both strategies single-threaded by contract.
- Both use unconditional L2-normalisation (INV-3).
- Errors do not crash the process; downstream falls back to VIO-only.

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 (MegaLoc) | `isinstance(MegaLocStrategy(...), VprStrategy)` | `True` |
| AC-1 (MixVPR) | `isinstance(MixVprStrategy(...), VprStrategy)` | `True` |
| AC-2 (MegaLoc) | `embed_query` output | shape (2048,), dtype float16, L2-norm ≈ 1.0 |
| AC-2 (MixVPR) | `embed_query` output | shape (4096,), dtype float16, L2-norm ≈ 1.0 |
| AC-3 (each) | `embed_query` × 3 same frame | bit-exact embeddings (ULP-tolerant) |
| AC-4 (each) | `retrieve_topk` against fixture corpus | `len == 10`, sorted, correct `backbone_label`, correct `descriptor_dim` |
| AC-5 (each) | `descriptor_dim()` × 100 | always returns the correct dim |
| AC-6 (each) | TRT engine with wrong output shape | `ConfigurationError` at create time |
| AC-7 (each) | `forward` raises | `VprBackboneError`; ERROR log + FDR |
| AC-8 (each) | malformed `image_bytes` | `VprPreprocessError`; ERROR log + FDR |
| AC-9 (each) | `compose_root(config=)` | wired; INFO log with correct backbone label and dim |
| AC-10 (each) | airborne binary + strategy chosen | `ConfigurationError` with missing-flag message; fail-fast |
| AC-11 (MegaLoc) | `MegaLocBackbonePreprocessor.input_shape()` | returns `(322, 322)` |
| AC-11 (MixVPR) | `MixVprBackbonePreprocessor.input_shape()` | returns `(320, 320)` |
| Preprocess-shape (each) | `preprocess(frame)` output | NCHW shape `(1, 3, H, W)`, dtype float16 |

## Constraints

- **Each strategy ships its own concrete preprocessor** — preprocessing parameters per upstream code drop (description.md § 6 "C2-internal helper, NOT a shared helper").
- **Preprocessing parameters are weights-coupled** — `(322, 322)` for MegaLoc, `(320, 320)` for MixVPR; ImageNet mean/std for both. Hard-coded; not config-knobs.
- **Centre-crop logic is duplicated, NOT shared** — copying preprocessing between strategies is intentional per the contract; sharing would couple weights-versions across strategies and let one strategy's upgrade silently break another's preprocessing.
- **Both use TensorRT runtime** (consistent with UltraVPR's path); the difference between secondary and primary is not the runtime but the build-flag ON/OFF in airborne.
- **No engine compilation in this task** — the `.trt` engine files come from AZ-321; this task consumes them via `config.vpr.backbone_weights_path`.
- **Both strategies hold engine IDs returned by `inference_runtime.load_engine`, NOT engines themselves**.
- **No GPU operations in `__init__` beyond engine load** — same constraint as UltraVPR.

## Risks & Mitigation

**Risk 1: MegaLoc and MixVPR upstream code drops use different ONNX op sets that TRT 10.3 partially supports**

- *Risk*: Engine compilation succeeds but with fallback layers that don't run on GPU; `embed_query` p95 inflates.
- *Mitigation*: AZ-321 (engine compile) is responsible for detecting fallback layers and reporting them. This task consumes the produced engine; if NFR-perf budgets are violated, AZ-321 escalates the upstream support gap.

**Risk 2: Higher embedding dim (4096 for MixVPR) inflates corpus storage requirements**

- *Risk*: A research binary that switches between UltraVPR (D=512) and MixVPR (D=4096) needs to rebuild the FAISS corpus every swap; researchers may forget.
- *Mitigation*: AZ-336 factory's pre-flight `descriptor_dim` validation catches the mismatch at startup with a clear error message. Researchers must rebuild the corpus (C10) before swapping; the helpful error tells them so.

**Risk 3: MegaLoc / MixVPR are research-only — operators may select them by mistake**

- *Risk*: A typo or copy-pasted research config selects MegaLoc / MixVPR on an airborne binary; cold start fails.
- *Mitigation*: AC-10 ensures fail-fast at composition-root with a clear message. Operators learn at startup, not after takeoff.

**Risk 4: Test fixtures for MegaLoc / MixVPR engines don't exist in CI**

- *Risk*: Without TRT engines for these strategies, the unit tests cannot exercise the full `embed_query` path; they're stubbed via `FakeInferenceRuntime`.
- *Mitigation*: This is fine — Step 9 / E-BBT validates the real engine path against C2-IT-01 and the C2-PT-01 NFR. The unit tests validate Protocol conformance + invariants; they don't need real engines.

**Risk 5: Preprocessing duplication across strategies invites subtle bugs**

- *Risk*: A bug fix to UltraVPR's centre-crop logic doesn't propagate to MegaLoc / MixVPR.
- *Mitigation*: This is the documented trade-off (description.md § 6). The duplication is intentional. If a bug fix is needed across strategies, each strategy's preprocessor is updated explicitly with a coordinated commit; cross-checking is part of code review.

## Runtime Completeness

- **Named capability**: secondary `VprStrategy` implementations for IT-12 comparative-study (architecture / E-C2 / `solution.md` "MegaLoc, MixVPR secondary backbones").
- **Production code that must exist**: real `MegaLocStrategy` and `MixVprStrategy` classes calling real C7 TRT `InferenceRuntime.forward` with real loaded `.trt` engines; real concrete preprocessors with real OpenCV resize + ImageNet normalisation + FP16 cast; real L2-normalisation; real composition-root wiring paths.
- **Allowed external stubs**: tests MAY use `FakeInferenceRuntime` returning pre-computed embeddings; `FakeTileStore`; `FakeFdrClient`; production wiring uses real C7 + real engines + real C6.
- **Unacceptable substitutes**: NumPy-only forward passes (would not satisfy NFR budgets); skipping L2-normalisation (would break INV-3); shared preprocessors across strategies (would defeat description.md § 6 isolation); selecting these strategies in airborne binaries (must fail-fast per AC-10); engine load at first frame (would defer the engine-output-shape assertion past startup); per-strategy thread safety (the contract is single-thread).
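
The fail-fast requirement for airborne binaries (AC-10: lazy import → `ImportError` → `ConfigurationError`) can be sketched roughly as follows. This is an illustration under stated assumptions, not the AZ-336 factory itself: `load_strategy_module` and the `_RESEARCH_STRATEGIES` table are hypothetical names, the dotted module paths are inferred from the file locations listed under Outcome, and the sketch assumes a build with the flag OFF simply omits the module from the binary.

```python
from __future__ import annotations

import importlib
from types import ModuleType


class ConfigurationError(Exception):
    """Stand-in for the project's configuration error type."""


# Hypothetical lookup: strategy name -> (module path, CMake flag that must be ON).
_RESEARCH_STRATEGIES = {
    "mega_loc": (
        "gps_denied_onboard.components.c2_vpr.mega_loc",
        "BUILD_VPR_MEGALOC",
    ),
    "mix_vpr": (
        "gps_denied_onboard.components.c2_vpr.mix_vpr",
        "BUILD_VPR_MIXVPR",
    ),
}


def load_strategy_module(strategy: str) -> ModuleType:
    """Import the strategy module lazily; map a missing module to ConfigurationError."""
    try:
        module_path, build_flag = _RESEARCH_STRATEGIES[strategy]
    except KeyError:
        raise ConfigurationError(f"unknown vpr.strategy {strategy!r}") from None
    try:
        return importlib.import_module(module_path)
    except ImportError as exc:
        # The message names the missing flag so the operator sees the cause at startup.
        raise ConfigurationError(
            f"vpr.strategy={strategy!r} requires {build_flag}=ON; "
            "the module is not present in this binary"
        ) from exc
```

Calling this at composition-root time, before any engine load, means a mistaken research config stops the process during startup rather than at the first frame.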