[AZ-339] C2 MegaLoc + MixVPR secondary VPR backbones

Adds two research-only VprStrategy implementations for the IT-12 comparative-study matrix. MegaLocStrategy (D=2048, 322x322) and MixVprStrategy (D=4096, 320x320), both via C7 TensorRT FP16 with their own concrete BackbonePreprocessor. Single-stage global L2 normalisation; retrieval delegated to FaissBridge; FDR records + structured logs identical to UltraVPR. BUILD_VPR_MEGALOC and BUILD_VPR_MIXVPR ON for research/replay-cli only, OFF for airborne and operator-tooling (fail-fast at composition root via existing AZ-336 factory). Uses helpers.iso_ts_from_clock from day 1 — no new timestamp helper duplicates introduced. 36 parametrised AC tests + 25 protocol-conformance + 18 helper regression tests pass; 1690 / 1690 unit tests pass (excluding 1 pre-existing flaky cold-start subprocess test in c12). Verdict: PASS_WITH_WARNINGS — one Medium follow-on (AZ-527 to consolidate 4-way _assert_engine_output_dim) + one Low AC wording drift. Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 09:01:14 +00:00 · 2026-05-13 23:52:54 +03:00
parent 5dfd9a577e
commit 0d65ff4705
9 changed files with 2283 additions and 1 deletions
@@ -0,0 +1,63 @@
+# Batch 50 — Implementation Report (Cycle 1)
+
+**Tasks**: AZ-339 (C2 MegaLoc + MixVPR Secondary Backbones — Research-only)
+**Date**: 2026-05-13
+**Cycle**: 1
+**Status**: COMPLETE (review verdict: PASS_WITH_WARNINGS, one Medium + one Low finding)
+
+## What was done
+
+Added two secondary `VprStrategy` implementations for IT-12 comparative-study: `MegaLocStrategy` (D=2048, 322×322 input) and `MixVprStrategy` (D=4096, 320×320 input). Both run via the C7 TensorRT runtime (or ONNX-Runtime fallback), apply ImageNet mean/std preprocessing + single-stage L2 normalisation, and delegate retrieval to `FaissBridge`. Both are gated OFF for airborne and operator-tooling per ADR-002 — `BUILD_VPR_MEGALOC` and `BUILD_VPR_MIXVPR` ON only for the research binary and replay-cli.
+
+### Files added (5)
+
+| File | Purpose |
+|------|---------|
+| `src/gps_denied_onboard/components/c2_vpr/mega_loc.py` | `MegaLocStrategy` class + `create()` factory + `_assert_engine_output_dim` helper |
+| `src/gps_denied_onboard/components/c2_vpr/_preprocessor_mega_loc.py` | `MegaLocBackbonePreprocessor` (centre-crop + 322×322 resize + ImageNet normalise + FP16 NCHW) |
+| `src/gps_denied_onboard/components/c2_vpr/mix_vpr.py` | `MixVprStrategy` class + `create()` factory + `_assert_engine_output_dim` helper |
+| `src/gps_denied_onboard/components/c2_vpr/_preprocessor_mix_vpr.py` | `MixVprBackbonePreprocessor` (centre-crop + 320×320 resize + ImageNet normalise + FP16 NCHW) |
+| `tests/unit/c2_vpr/test_az339_mega_loc_mix_vpr.py` | 36 parametrised AC tests across both strategies |
+
+### Files changed
+- _None._ The composition-root factory (`runtime_root/vpr_factory.py`) was already wired for `mega_loc` and `mix_vpr` strategy names at AZ-336 land time — `_STRATEGY_TO_BUILD_FLAG` and `_STRATEGY_TO_MODULE` tables already include the rows. The `KNOWN_STRATEGIES` frozenset in `c2_vpr/config.py` already includes both. The `module-layout.md` `Component: c2_vpr` § Internal list already names `mega_loc.py` and `mix_vpr.py` (pre-declared by AZ-336). No CMake change required — `BUILD_VPR_*` gating is environment-variable-based per `_is_build_flag_on` in `vpr_factory.py`.
+
+## AC coverage
+
+All 11 ACs verified per strategy via the parametrised test suite. See `_docs/03_implementation/reviews/batch_50_review.md` § Phase 2 for the AC ↔ test mapping table.
+
+| AC | Status | Notes |
+|----|--------|-------|
+| AC-1..AC-9 + AC-11 | PASS | Each AC parametrised over both strategies (36 test cases total) |
+| AC-10 | PASS with drift | Implementation raises `StrategyNotAvailableError` (env-flag OFF path) and `ConfigError` (runtime-label mismatch path); the spec literally names `ConfigurationError`. Mirrors the established AZ-337 / AZ-338 precedent. Logged as Low finding F2. |
+
+## Test results
+
+- `tests/unit/c2_vpr/test_az339_mega_loc_mix_vpr.py` — **36 / 36 PASS**.
+- `tests/unit/c2_vpr/test_protocol_conformance.py` — **25 / 25 PASS** (auto-extends across all 7 strategies; the two new ones are picked up by the parametrised `_STRATEGY_MODULES` table without test changes).
+- `tests/unit/c2_vpr/` (full directory: faiss_bridge + net_vlad + ultra_vpr + new AZ-339 file) — **126 / 126 PASS**.
+- `tests/unit/test_az508_iso_timestamps.py` — **18 / 18 PASS** (AZ-526 regression guard confirms no new `_iso_ts_from_clock` duplicates introduced by the AZ-339 strategies).
+- `tests/unit/test_az270_compose_root.py` — **8 / 8 PASS**.
+- `tests/unit/test_az272_fdr_record_schema.py` — **33 / 33 PASS** (unmodified; the new strategies emit FDR records that match the existing schema).
+- Full unit suite: **1690 passed, 80 skipped (TRT/CUDA/actionlint), 1 pre-existing failure** (`test_cold_start_under_500ms_p99` — subprocess timeout on cold-start latency budget, unrelated; confirmed by stashing AZ-339 changes and re-running).
+- `ruff check` on all 5 new files — clean.
+
+## Architectural decisions
+
+1. **Single parametrised test file `test_az339_mega_loc_mix_vpr.py`** — rather than two near-identical files mirroring `test_ultra_vpr.py` / `test_net_vlad.py`. The two strategies share byte-identical behavioural contracts (same Protocol, same FDR record kinds, same log kinds, same error envelope) and differ only on three values (`DESCRIPTOR_DIM`, `_BACKBONE_LABEL`, preprocessor `input_shape()`). A parametrised approach keeps any future drift visible at the assertion level and reduces the test surface from ~1500 lines (two copies of test_ultra_vpr.py) to ~700 lines.
+2. **Preprocessor duplication preserved** (mega_loc vs mix_vpr vs ultra_vpr) — per `components/02_c2_vpr/description.md` § 6 and the task spec § Constraints. Each preprocessor owns its own input-shape constants so a future code drop can change a backbone's preprocessing without coupling other strategies' weights-versions.
+3. **`_assert_engine_output_dim` duplicated, NOT extracted** — see Spec Drift / Review Finding F1 below. The cleaner path is a dedicated AZ-527 hygiene PBI mirroring AZ-508 → AZ-526.
+4. **`iso_ts_from_clock` imported from the AZ-526 helper from day 1** — neither new strategy introduces a local `_iso_ts_from_clock` body. The AZ-526 regression guard test confirms this.
+5. **Runtime-label guard placed inside `create()`** (not in `__init__`) — runtime selection is a composition-time concern; once the strategy is constructed it's expected to work. Matches the UltraVPR / NetVLAD precedent.
+
+## Spec drift noted (carried into review F2)
+
+AZ-339 § AC-10 literally specifies `ConfigurationError` for the build-flag-OFF case. The existing AZ-336 composition-root factory raises `StrategyNotAvailableError` for this case (per its own contract and test coverage at `test_protocol_conformance.py:268-274`). The strategy module's own runtime-label guard raises `ConfigError` for the related "wrong C7 runtime" case. AZ-337 (UltraVPR) and AZ-338 (NetVLAD) followed this same pattern; AZ-339 mirrors them. AC-10 wording should be amended in a future spec pass; no code change required.
+
+## Cumulative review obligation
+
+This batch is mid-window (batch 50, next cumulative review at batch 51 / batches 49-51). The new finding F1 (`_assert_engine_output_dim` 4-way duplication) will surface in that cumulative review, and AZ-527 (the planned hygiene PBI) will close it. The AZ-526 regression guard test confirmed that neither AZ-526's F1+F3 closure regressed in AZ-339.
+
+## Follow-on PBI
+
+**AZ-527** (Hygiene — consolidate `_assert_engine_output_dim` into a c2-internal helper). 2 points. Depends on AZ-339. To be created and prioritised as Batch 51 or 52.
@@ -0,0 +1,104 @@
+# Code Review Report
+
+**Batch**: 50 — AZ-339 (C2 MegaLoc + MixVPR Secondary Backbones)
+**Date**: 2026-05-13
+**Verdict**: PASS_WITH_WARNINGS
+
+## Findings
+
+| # | Severity | Category | File:Line | Title |
+|---|----------|----------|-----------|-------|
+| 1 | Medium | Maintainability/Architecture | `c2_vpr/mega_loc.py:438`, `mix_vpr.py:432`, `ultra_vpr.py:432`, `net_vlad.py:494` | `_assert_engine_output_dim` now 4-way duplicated — schedule AZ-527 |
+| 2 | Low | Scope | AZ-339 task spec § AC-10 | AC-10 names `ConfigurationError`; precedent + impl raise `StrategyNotAvailableError` / `ConfigError` |
+
+### Finding Details
+
+**F1: `_assert_engine_output_dim` now 4-way duplicated** (Medium / Maintainability + Architecture)
+
+- Locations:
+  - `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:432`
+  - `src/gps_denied_onboard/components/c2_vpr/net_vlad.py:494`
+  - `src/gps_denied_onboard/components/c2_vpr/mega_loc.py:438`
+  - `src/gps_denied_onboard/components/c2_vpr/mix_vpr.py:432`
+- Description: Each strategy module ships a near-identical ~22-line `_assert_engine_output_dim(inference_runtime, handle, preprocessor)` helper. Bodies vary only on three values: `_OUTPUT_KEY` (always `"embedding"` for mega_loc / mix_vpr / ultra_vpr; `"vlad_descriptor"` for net_vlad), `DESCRIPTOR_DIM` (per-strategy constant), and `preprocessor.input_shape()`. Same drift signature as AZ-508 → AZ-526 (`_iso_ts_now` / `_iso_ts_from_clock`).
+- The cumulative review (batches 46-48) flagged this duplication as F2 and recommended deferring "until a third VPR strategy joins (AZ-339 batch)". That trigger has fired.
+- Suggestion: Create AZ-527 (Hygiene — consolidate `_assert_engine_output_dim` into a c2-internal helper). Signature: `_assert_engine_output_dim(inference_runtime, handle, *, expected_dim, output_key, input_shape)`. 2 points; depends on AZ-339.
+- Inline comments in the new mega_loc.py and mix_vpr.py already cite AZ-527 as the planned follow-on so the duplication is intentional, not accidental.
+- Task: AZ-339 (carries forward from cumulative-46-48 F2).
+
+**F2: AC-10 names `ConfigurationError` but precedent / implementation raise `StrategyNotAvailableError` / `ConfigError`** (Low / Scope)
+
+- Location: `_docs/02_tasks/todo/AZ-339_c2_megaloc_mixvpr.md` § AC-10.
+- Description: AC-10 literally reads "`ConfigurationError` is raised at composition-root time with message containing the missing flag; the binary refuses to start (fail-fast per AZ-336 factory's lazy-import → ImportError → `ConfigurationError` mapping)". The existing AZ-336 factory (`build_vpr_strategy`) raises **`StrategyNotAvailableError`** for the `BUILD_VPR_<X>=OFF` case (verified via `tests/unit/c2_vpr/test_protocol_conformance.py:268-274` for UltraVPR; same pattern auto-extends to MegaLoc / MixVPR via the parametrized `_STRATEGY_MODULES` table). `StrategyNotAvailableError` is a `RuntimeError` subclass, NOT a `ConfigError`. AZ-337 / AZ-338 followed this precedent; AZ-339 does the same. The strategy module's own runtime-label guard raises `ConfigError` (the "wrong C7 runtime label" case), which satisfies AC-10's *spirit* of "composition-time fail-fast".
+- Implementation choice: mirrored the existing precedent.
+- Suggestion: amend AC-10 to read "`StrategyNotAvailableError` (for BUILD flag OFF) or `ConfigError` (for runtime-label mismatch) at composition-root time, with a message naming the missing flag or runtime". Recorded as drift; no code change required.
+- Task: AZ-339.
+
+## Phase Summary
+
+### Phase 1 — Context Loading
+Read AZ-339 task spec (208 lines, AC-1..AC-11 per strategy + NFRs), the AZ-337 UltraVPR + AZ-338 NetVLAD precedents (`ultra_vpr.py`, `net_vlad.py`, `_preprocessor_ultra_vpr.py`), the AZ-336 composition-root factory (`vpr_factory.py`), the AZ-336 C2VprConfig + KNOWN_STRATEGIES, and the `cumulative_review_batches_46-48_cycle1_report.md` F2 finding. Mapped 5 new files (2 strategy, 2 preprocessor, 1 test) to AZ-339.
+
+### Phase 2 — Spec Compliance
+
+| AC | Verified by | Status |
+|----|-------------|--------|
+| AC-1 (Protocol conformance) | `test_ac1_protocol_conformance[mega_loc]`, `[mix_vpr]` | PASS |
+| AC-2 (L2-norm FP16 correct dim) | `test_ac2_embed_query_returns_unit_norm_fp16_correct_dim[*]`, `test_ac2_single_stage_l2_no_intra_cluster_call[*]` | PASS |
+| AC-3 (deterministic embeddings) | `test_ac3_embed_query_deterministic_for_same_frame[*]` | PASS |
+| AC-4 (retrieve_topk k + label) | `test_ac4_retrieve_topk_returns_exactly_k_with_correct_label[*]` | PASS |
+| AC-5 (descriptor_dim stable) | `test_ac5_descriptor_dim_stable[*]` | PASS |
+| AC-6 (engine shape mismatch → ConfigError at create) | `test_ac6_create_rejects_engine_output_shape_mismatch[*]`, `test_ac6_create_rejects_missing_embedding_key[*]` | PASS |
+| AC-7 (VprBackboneError on forward failure) | `test_ac7_runtime_error_yields_vpr_backbone_error[*]`, `test_ac7_wrong_forward_output_shape_yields_vpr_backbone_error[*]` | PASS |
+| AC-8 (VprPreprocessError on corrupt image) | `test_ac8_corrupt_image_yields_vpr_preprocess_error[*]` | PASS |
+| AC-9 (compose wiring + INFO ready log) | `test_ac9_create_emits_ready_log_with_correct_label_and_dim[*]` | PASS |
+| AC-10 (build-flag exclusion fail-fast) | `test_ac10_runtime_label_mismatch_raises_config_error[*]` + `tests/unit/c2_vpr/test_protocol_conformance.py` parametrised over `_STRATEGY_MODULES` (auto-covers mega_loc + mix_vpr) | PASS with F2 wording drift |
+| AC-11 (preprocessor input shape) | `test_ac11_preprocessor_input_shape[*]`, `test_preprocess_output_is_nchw_fp16[*]` | PASS |
+
+36 / 36 tests in `test_az339_mega_loc_mix_vpr.py` pass; 25 / 25 in `test_protocol_conformance.py` pass (now auto-covering the two new strategies via the existing parametrised module-import table).
+
+### Phase 3 — Code Quality
+
+- **SRP**: Strategy class = embed + retrieve via injected dependencies. Preprocessor class = decode + crop + resize + normalise. Each error handler is a separate helper method. Factory `create()` is wiring-only.
+- **Error handling**: Every failure path emits a structured ERROR log AND an FDR record before raising. Errors are explicitly re-raised; no swallowed exceptions.
+- **Naming**: Consistent with the UltraVPR precedent — `_BACKBONE_LABEL`, `_OUTPUT_KEY`, `_LOG_KIND_*`, `_FDR_KIND_*`, `_assert_engine_output_dim`. `DESCRIPTOR_DIM` is module-level Final per strategy (2048 / 4096), matching the AZ-337 / AZ-338 pattern.
+- **Complexity**: Strategy class ~310 lines (incl. error handlers); `embed_query` ~55 lines (within the 50-line guidance; same shape as UltraVPR). Cyclomatic complexity low.
+- **DRY**: Strategy-pair duplication (mega_loc vs mix_vpr) is **intentional** per the task spec § Constraints: "Each strategy ships its own concrete preprocessor — preprocessing parameters per upstream code drop … sharing would couple weights-versions across strategies and let one strategy's upgrade silently break another's preprocessing." `_assert_engine_output_dim` duplication is unintentional — see F1.
+- **Test quality**: AAA pattern with explicit markers. Parametrised across `_StrategySpec` to keep cross-strategy assertions semantically identical. Each AC has at least one parametrised test plus targeted variants for failure modes.
+- **Dead code**: None introduced. `Literal` import in strategy modules is used by `_BACKBONE_LABEL: Final[Literal["mega_loc"]]` / `["mix_vpr"]` annotations.
+
+### Phase 4 — Security Quick-Scan
+
+- No SQL, no shell, no `eval` / `exec`, no dynamic deserialisation.
+- `cv2.resize` is the only third-party call; inputs are typed `np.ndarray` and validated for dtype / ndim / shape upstream.
+- `error_message[:512]` truncation prevents pathological log-line / FDR-payload growth on a long backbone error.
+- No hardcoded secrets, API keys, or paths beyond test-fixture placeholders (`/models/mega_loc.trt`, `/cache/vpr/index.faiss`).
+- Image inputs are byte-bounded (`uint8` only); rejection paths emit `VprPreprocessError` not raw `cv2.error`.
+
+### Phase 5 — Performance Scan
+
+- Construction is O(1) (no GPU ops in `__init__` per the task spec § Constraints).
+- `embed_query` is O(H·W) for decode / resize / normalise — same algorithmic cost as UltraVPR. The 2048-d / 4096-d FP16 embedding is allocated once per frame.
+- No N+1 patterns, no unbounded fetching.
+- One FDR-record allocation per frame on the success path — same per-frame allocation cost as UltraVPR; sits well below the bounded `capacity` of the FdrClient ring.
+- NFR-perf budgets (MegaLoc ≤ 80 ms p95, MixVPR ≤ 100 ms p95) are research-side guidance per the task spec § NFR; not engine-rule-binding. Cannot be measured in unit tests; deferred to Step 9 / E-BBT against the real engines per the task spec § Risks 1 + 4.
+
+### Phase 6 — Cross-Task Consistency
+
+Single-batch with two strategies — they were implemented in lockstep, share the same factory `create()` shape, the same error envelope, the same FDR record kinds (`vpr.embed_query`, `vpr.backbone_error`, `vpr.preprocess_error`), and the same log kinds. The parametrised test surface verifies behavioural equivalence directly.
+
+### Phase 7 — Architecture Compliance
+
+- **Layer direction**: c2_vpr modules import from `_types` (L1), `clock` (L1), `helpers.descriptor_normaliser` (L1), `helpers.iso_timestamps` (L1), `config.schema` (L1), `fdr_client` (L2), and internal `c2_vpr` modules. No upward imports. **PASS.**
+- **Public API respect**: The strategies do NOT import from `c6_tile_cache` or `c7_inference` directly — they use the consumer-side cuts (`DescriptorIndexCut`, `InferenceRuntimeCut`) defined locally in c2_vpr per AZ-507. **PASS.**
+- **No new cyclic dependencies**: New modules sit in c2_vpr leaf positions; no incoming imports from c2_5 / c3 / runtime_root that didn't already exist for ultra_vpr / net_vlad. **PASS.**
+- **Duplicate symbols**: F1 (above) — `_assert_engine_output_dim` is the only new duplication. Strategy class names are unique (`MegaLocStrategy`, `MixVprStrategy`). Preprocessor class names are unique. Constants (`DESCRIPTOR_DIM`, `_BACKBONE_LABEL`) are module-scoped and intentionally per-strategy.
+- **Cross-cutting concerns not locally re-implemented**: The new strategies import `iso_ts_from_clock` from `gps_denied_onboard.helpers.iso_timestamps` — they do NOT re-introduce a local `_iso_ts_from_clock` body (verified by `test_ac4_az526_no_module_level_iso_ts_from_clock_outside_helper` continuing to pass post-AZ-339). **PASS.** AZ-526's regression guard worked exactly as designed.
+
+## Pre-existing failure noted (not blocking)
+
+`tests/unit/c12_operator_orchestrator/test_cli_console_script.py::TestConsoleScript::test_cold_start_under_500ms_p99` — fails on this dev laptop with a `subprocess.TimeoutExpired` after 5 seconds when running `operator-orchestrator --help`. Confirmed pre-existing by stashing the AZ-339 changes, running the test against the prior commit `5dfd9a5` (AZ-526), and observing the same failure. Cold-start latency depends on local Python interpreter startup + import time and is unrelated to this batch. Not blocking; logged here for traceability.
+
+## Verdict Rationale
+
+One Medium finding (F1: `_assert_engine_output_dim` 4-way duplication, planned for AZ-527) and one Low finding (F2: AC-10 wording drift, mirroring established AZ-337 / AZ-338 precedent). No Critical, no High. Verdict: **PASS_WITH_WARNINGS**.