C2 UltraVPR Primary Backbone
Task: AZ-337_c2_ultra_vpr
Name: C2 UltraVPR Primary Backbone (TRT)
Description: Implement UltraVprStrategy, the production-default VprStrategy (per ADR-001 default selection). UltraVPR is the Documentary Lead's PRIMARY backbone selected at config time and ON in airborne / research / replay-cli binaries (per ADR-002 build-time exclusion map). Wraps the upstream UltraVPR research code drop, exposes its forward pass via the C7 InferenceRuntime (TensorRT 10.3 primary, ONNX-Runtime fallback), and produces L2-normalised float16 embeddings (D=512 typical) for FAISS HNSW retrieval. Includes the concrete UltraVprBackbonePreprocessor (resize / centre-crop / mean-std normalise per UltraVPR's input contract). The strategy MUST satisfy the AC-2.1b recall@10 ≥ 0.95 floor on the Derkachi normal segment and the C2-PT-01 latency budget (embed_query p95 ≤ 60 ms).
Complexity: 5 points
Dependencies: AZ-336_c2_vpr_strategy_protocol, AZ-263_initial_structure, AZ-269_config_loader, AZ-298_c7_tensorrt_runtime, AZ-303_c6_storage_interfaces, AZ-283_descriptor_normaliser, AZ-281_engine_filename_schema, AZ-321_c10_engine_compiler (engine compile path; UltraVPR engine is one of the engines C10 builds), AZ-266_log_module, AZ-272_fdr_record_schema
Component: c2_vpr (epic AZ-255 / E-C2)
Tracker: AZ-337
Epic: AZ-255 (E-C2)
Document Dependencies
- _docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md: Protocol contract this task implements (every invariant MUST be satisfied).
- _docs/02_document/components/02_c2_vpr/description.md: § 1 PRIMARY backbone designation; § 2 interface; § 5 backbone weights ≤ 600 MB GPU; § 7 GPU stream race notes; § 9 logging.
- _docs/02_document/module-layout.md: c2_vpr Per-Component Mapping (ultra_vpr.py Internal); BUILD_VPR_ULTRA_VPR row in build-time exclusion map (ON for airborne/research/replay-cli, OFF for operator-tooling).
- _docs/02_document/contracts/c7_inference/inference_runtime_protocol.md: InferenceRuntime interface (engine load, forward pass, output extraction).
- _docs/02_document/contracts/shared_helpers/descriptor_normaliser.md: L2 normalisation contract (UltraVPR raw embeddings are NOT L2-normalised; this task MUST normalise).
- _docs/02_document/contracts/shared_helpers/engine_filename_schema.md: TensorRT engine filename → metadata extraction.
- _docs/02_document/components/02_c2_vpr/tests.md: C2-IT-01 (recall@10 ≥ 0.95 on Derkachi); C2-IT-02 (VprResult invariants); C2-PT-01 (embed_query p95 ≤ 60 ms; combined ≤ 65 ms; ≤ 600 MB GPU; ≤ 200 MB sys mem).
Problem
UltraVPR is the production-default backbone (description.md § 1 "Documentary Lead PRIMARY backbone"). Without this task:
- The composition root has no concrete strategy to wire when config.vpr.strategy = "ultra_vpr" (the default value); the airborne binary cannot start.
- AC-2.1b (recall@10 ≥ 0.95 on Derkachi), the highest-priority C2 acceptance criterion, has no producer; the suite-level FT-P-19 satellite re-loc test cannot pass.
- AC-4.1 latency budget for VPR is allocated against UltraVPR specifically (60 ms embed_query); without the TRT-backed implementation, the budget is unconsumable and the E2E latency target (400 ms p95) cannot be validated.
- The 600 MB GPU memory ceiling for backbone weights is enforced at the implementation layer; without it, no operator can validate that the airborne deployment fits the Tier-1 Jetson Orin's GPU memory budget.
- UltraVPR has a non-trivial input preprocessing contract (specific resize target, centre-crop, ImageNet mean/std normalisation, FP16 cast); without UltraVprBackbonePreprocessor, every consumer would re-derive the contract → silent recall regression.
Outcome
- src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py defining:
  - UltraVprStrategy class implementing the VprStrategy Protocol (AZ-336).
  - Constructor signature: __init__(self, runtime: InferenceRuntime, tile_store: TileStore, weights_path: Path, preprocessor: UltraVprBackbonePreprocessor, normaliser: DescriptorNormaliser, fdr_client: FdrClient).
  - embed_query(frame, calibration):
    - tensor = self._preprocessor.preprocess(frame, calibration) (returns FP16 NCHW (1, 3, H, W)).
    - raw = self._runtime.forward(self._engine_id, {"input": tensor})["embedding"] (returns FP16 (1, 512)).
    - embedding = self._normaliser.l2_normalise(raw[0]) (returns FP16 (512,) with ||embedding||_2 == 1.0 ± 1e-3).
    - Return VprQuery(frame_id, embedding, produced_at=monotonic_ns()).
    - Catch RuntimeError / CudaError → wrap in VprBackboneError; emit ERROR log + FDR record kind="vpr.backbone_error".
  - retrieve_topk(query, k):
    - distances, tile_ids = self._tile_store.faiss_topk(query.embedding, k) (delegates to C6 TileStore Public API).
    - Build [VprCandidate(tile_id, distance, descriptor_dim=512) for ...].
    - Return VprResult(query.frame_id, candidates, retrieved_at=monotonic_ns(), backbone_label="ultra_vpr").
    - On IndexUnavailableError (raised by C6 TileStore on stale handle), re-raise unchanged.
  - descriptor_dim() -> int: returns 512 (the UltraVPR research code drop's published embedding dim; the value is asserted at engine-load time against the engine's output tensor shape; mismatch → ConfigurationError at startup, per AC-6).
  - Module-level create(config, tile_store, inference_runtime) -> VprStrategy:
    - Resolve weights_path = config.vpr.backbone_weights_path (a TensorRT engine file produced by C10's engine compiler, AZ-321, with the AZ-281 self-describing filename schema).
    - Construct UltraVprBackbonePreprocessor(input_shape=(384, 384), mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)) (parameters from the upstream UltraVPR config, hard-coded here per CONST.SRP; these are weights-coupled, not config-knobs).
    - Construct DescriptorNormaliser (or fetch from helpers; AZ-283).
    - Load engine via inference_runtime.load_engine(weights_path); the engine ID is captured for later forward calls.
    - Assert engine output shape == (1, 512) FP16; mismatch → ConfigurationError.
    - Construct and return UltraVprStrategy(...).
- src/gps_denied_onboard/components/c2_vpr/_preprocessor_ultra_vpr.py (or _preprocessor.py shared scaffolding + concrete UltraVprBackbonePreprocessor):
  - Implements BackbonePreprocessor Protocol from AZ-336.
  - preprocess(frame, calibration):
    - Decode frame.image_bytes to RGB uint8 ndarray (H_in, W_in, 3) via OpenCV / Pillow.
    - Centre-crop to a square region of side min(H_in, W_in) using calibration's principal point if non-centre (otherwise geometric centre). Calibration is consumed here for principal-point alignment per the upstream UltraVPR contract; if calibration is absent, fall back to geometric centre with a WARN log.
    - Resize to (384, 384) via OpenCV INTER_AREA for downscale, INTER_CUBIC for upscale.
    - Normalise: (pixel/255.0 - mean) / std per channel; cast to FP16.
    - Transpose HWC → CHW; add batch dim → NCHW.
    - Return ndarray of shape (1, 3, 384, 384) dtype float16.
  - input_shape() -> tuple[int, ...]: returns (384, 384).
  - On any preprocessing failure (corrupt image bytes, calibration mismatch), raise VprPreprocessError and emit ERROR log + FDR record kind="vpr.preprocess_error".
- Composition-root wiring: runtime_root.compose_root includes a path that, when config.vpr.strategy == "ultra_vpr", calls UltraVprStrategy.create(config, tile_store, inference_runtime) via the AZ-336 factory.
- Logging per description.md § 9:
  - INFO kind="c2.vpr.ready" with {strategy: "ultra_vpr", descriptor_dim: 512, corpus_size: <N>} after engine load.
  - WARN kind="c2.vpr.top1_distance_above_threshold" if top-1 distance > config.vpr.warn_top1_threshold (default 0.30).
  - ERROR kind="c2.vpr.backbone_error" and kind="c2.vpr.preprocess_error" per error path.
  - DEBUG kind="c2.vpr.frame_distances" with top-K distances per frame (gated by config; off by default to avoid log volume at 3 Hz).
- FDR records emitted: kind="vpr.embed_query" (per frame, with frame_id + backbone_label + bbox of distances), kind="vpr.backbone_error" and kind="vpr.preprocess_error" (per error).
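The centre-crop and resize steps of preprocess can be sketched in plain Python without touching pixel data. This is only an illustration under assumptions: the helper names centre_crop_box and interpolation_for are hypothetical, the principal point is taken as (cx, cy) in pixels, and clamping the crop box to the image bounds is a choice of this sketch that the task text does not specify.

```python
from typing import Optional, Tuple

TARGET_SIDE = 384  # resize target stated in the task description


def centre_crop_box(
    height: int,
    width: int,
    principal_point: Optional[Tuple[float, float]] = None,
) -> Tuple[int, int, int]:
    """Return (top, left, side) of a square crop with side min(height, width).

    The crop is centred on the principal point (cx, cy) when one is given,
    otherwise on the geometric centre. Clamping the box so that it stays
    inside the image is an assumption of this sketch.
    """
    side = min(height, width)
    if principal_point is None:
        cx, cy = width / 2.0, height / 2.0
    else:
        cx, cy = principal_point
    left = int(round(cx - side / 2.0))
    top = int(round(cy - side / 2.0))
    left = max(0, min(left, width - side))
    top = max(0, min(top, height - side))
    return top, left, side


def interpolation_for(side: int, target: int = TARGET_SIDE) -> str:
    """Name the OpenCV flag to use: INTER_AREA when shrinking, INTER_CUBIC otherwise."""
    return "INTER_AREA" if side > target else "INTER_CUBIC"


geometric = centre_crop_box(480, 640)
aligned = centre_crop_box(480, 640, principal_point=(330.0, 250.0))
```

For a 640×480 frame the crop is always 480 pixels square, so only the horizontal offset can follow the principal point; the vertical offset is clamped to 0.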
Scope
Included
- UltraVprStrategy class implementing the VprStrategy Protocol exactly per the AZ-336 contract.
- UltraVprBackbonePreprocessor implementing BackbonePreprocessor Protocol with the upstream UltraVPR's published preprocessing parameters.
- Module-level create(config, tile_store, inference_runtime) factory entry-point.
- Engine-output-shape assertion at load time ((1, 512) FP16); mismatch → ConfigurationError.
- L2-normalisation of every embedding via the AZ-283 DescriptorNormaliser helper.
- Composition-root wiring path for config.vpr.strategy == "ultra_vpr".
- Logging per description.md § 9 (INFO ready, WARN top-1-above-threshold, ERROR error paths, DEBUG per-frame distances).
- FDR record emission for embed-query and error paths.
- Unit tests covering all 7 invariants (INV-1..INV-7), the engine-output-shape assertion, the preprocessing contract, the L2-normalisation post-condition, the composition-root wiring path.
- BUILD_VPR_ULTRA_VPR CMake flag wiring (per ADR-002): the strategy module is excluded from the operator-tooling binary.
Excluded
- The VprStrategy Protocol + BackbonePreprocessor Protocol + DTOs + errors + factory: owned by AZ-336.
- The DescriptorNormaliser helper: already AZ-283.
- The C7 InferenceRuntime (engine load + forward pass): owned by AZ-298 (TensorRT runtime).
- The C6 TileStore.faiss_topk query: owned by AZ-303 / AZ-306; this task consumes the Public API.
- Engine compile (.onnx → .trt): owned by AZ-321 (c10_engine_compiler); this task consumes the produced .trt engine via config.vpr.backbone_weights_path.
- Other backbones: AZ-338 (NetVLAD), AZ-339 (MegaLoc + MixVPR), AZ-340 (SelaVPR + EigenPlaces + SALAD).
- FAISS HNSW wiring at the strategy level: retrieve_topk delegates to tile_store.faiss_topk; the FAISS index lifecycle (mmap, sidecar verify, handle invalidation) is owned by AZ-341.
- Component-internal tests beyond Protocol + invariants + preprocessing-contract: C2-IT-01 (recall@10 acceptance test), C2-IT-03 (poisoned-tile), C2-IT-04 (scale-ratio), C2-PT-01 (latency NFR), C2-ST-01 (stale handle) are deferred to Step 9 / E-BBT.
Acceptance Criteria
AC-1: Protocol conformance
Given a constructed UltraVprStrategy instance
When isinstance(strategy, VprStrategy) is evaluated
Then the result is True; the instance has embed_query, retrieve_topk, descriptor_dim
AC-2: embed_query produces L2-normalised FP16 (512,) embedding
Given a valid NavCameraFrame and CameraCalibration
When strategy.embed_query(frame, calibration) is called
Then a VprQuery is returned with embedding.shape == (512,), embedding.dtype == np.float16, ||embedding||_2 == 1.0 ± 1e-3
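The post-condition in AC-2 can be illustrated with a standard-library sketch that normalises a vector and then rounds each element to half precision. The production path would use the AZ-283 DescriptorNormaliser on NumPy arrays; the helper names below, the synthetic raw_embedding, and the ValueError on an all-zero input are assumptions of this sketch.

```python
import math
import struct
from typing import List, Sequence


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def l2_norm(vec: Sequence[float]) -> float:
    """Euclidean norm computed in double precision."""
    return math.sqrt(sum(v * v for v in vec))


def l2_normalise_fp16(raw: Sequence[float]) -> List[float]:
    """Scale raw to unit L2 norm, then round every element to FP16."""
    norm = l2_norm(raw)
    if norm == 0.0:
        # Assumption of this sketch: an all-zero embedding is treated as an error.
        raise ValueError("cannot L2-normalise an all-zero embedding")
    return [to_fp16(v / norm) for v in raw]


# Deterministic 512-element stand-in for a raw, un-normalised embedding.
raw_embedding = [3.0 * math.sin(0.37 * i) for i in range(512)]
embedding = l2_normalise_fp16(raw_embedding)
norm_error = abs(l2_norm(embedding) - 1.0)
```

Rounding to FP16 perturbs each element by a relative error of at most about 2^-11, which is why the acceptance criterion allows a tolerance of 1e-3 rather than demanding a norm of exactly 1.0.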
AC-3: embed_query is deterministic (INV-2 + INV-6)
Given the same frame + calibration
When embed_query is called 3 times
Then all three returns have bit-exact embedding arrays (ULP-tolerant for FP16); frame_id and produced_at differ across calls but embedding does not
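One way a test could check bit-exactness is to compare the packed half-precision bytes of each embedding rather than the float values. The sketch below uses only the standard library; fp16_bits and all_bit_exact are hypothetical helper names, and the real test would more likely compare NumPy float16 arrays.

```python
import struct
from typing import Sequence


def fp16_bits(embedding: Sequence[float]) -> bytes:
    """Pack an embedding into its little-endian FP16 byte representation."""
    return struct.pack(f"<{len(embedding)}e", *embedding)


def all_bit_exact(embeddings: Sequence[Sequence[float]]) -> bool:
    """True when every embedding has the same FP16 bit pattern as the first."""
    first = fp16_bits(embeddings[0])
    return all(fp16_bits(e) == first for e in embeddings[1:])


same = all_bit_exact([[0.5, -0.25], [0.5, -0.25], [0.5, -0.25]])
# 0.5 + 2**-10 is a different representable FP16 value, so the bytes differ.
different = all_bit_exact([[0.5], [0.5 + 2 ** -10]])
```

A byte-level comparison is stricter than value equality: it also distinguishes +0.0 from -0.0.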
AC-4: retrieve_topk returns exactly k candidates sorted ascending
Given a corpus of 100 tiles loaded into C6 TileStore + a constructed VprQuery
When strategy.retrieve_topk(query, k=10) is called
Then len(candidates) == 10; [c.descriptor_distance for c in candidates] is non-strictly-ascending; backbone_label == "ultra_vpr"; candidates[0].descriptor_dim == 512
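A minimal sketch of how a test might exercise the AC-4 ordering check against a fake tile store. The FakeTileStore here is a hypothetical brute-force double, the squared-L2 metric is an assumption (the task text does not name the FAISS metric), and the toy corpus uses 2-element vectors instead of 512 to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class VprCandidate:
    tile_id: str
    descriptor_distance: float
    descriptor_dim: int


class FakeTileStore:
    """Hypothetical test double: brute-force search over an in-memory corpus."""

    def __init__(self, corpus: Dict[str, Sequence[float]]):
        self._corpus = dict(corpus)

    def faiss_topk(self, embedding: Sequence[float], k: int) -> Tuple[List[float], List[str]]:
        # Score every tile by squared L2 distance and keep the k smallest.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(embedding, vec)), tile_id)
            for tile_id, vec in self._corpus.items()
        )[:k]
        return [d for d, _ in scored], [t for _, t in scored]


def build_candidates(tile_store, embedding, k: int, dim: int) -> List[VprCandidate]:
    """Delegate the search to the tile store and wrap the result in candidates."""
    distances, tile_ids = tile_store.faiss_topk(embedding, k)
    return [VprCandidate(t, d, dim) for d, t in zip(distances, tile_ids)]


corpus = {f"tile_{i:03d}": (i * 0.01, 0.0) for i in range(100)}
candidates = build_candidates(FakeTileStore(corpus), (0.0, 0.0), k=10, dim=2)
distances = [c.descriptor_distance for c in candidates]
```

The strategy itself does no sorting in this sketch; it relies on the tile store returning distances in ascending order, which is what the assertion on distances verifies.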
AC-5: descriptor_dim() is stable and returns 512
Given a constructed UltraVprStrategy
When descriptor_dim() is called 100 times
Then every call returns 512
AC-6: Engine output shape mismatch at load → ConfigurationError
Given a TRT engine whose output tensor shape is (1, 256) (not 512)
When UltraVprStrategy.create(config, tile_store, inference_runtime) is called
Then ConfigurationError is raised with message containing "engine output shape mismatch: expected (1, 512), got (1, 256)"; the strategy is NOT instantiated
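The fail-fast check in AC-6 amounts to a shape comparison before the strategy is constructed. The sketch below assumes a stand-in ConfigurationError class and a hypothetical helper name; how the real factory obtains the engine's output shape from C7 is not shown in the task text and is left out.

```python
from typing import Sequence, Tuple

EXPECTED_OUTPUT_SHAPE: Tuple[int, ...] = (1, 512)


class ConfigurationError(Exception):
    """Stand-in for the project's configuration error type."""


def assert_engine_output_shape(
    actual_shape: Sequence[int],
    expected_shape: Sequence[int] = EXPECTED_OUTPUT_SHAPE,
) -> None:
    """Raise ConfigurationError when the engine's output shape is not the expected one."""
    actual, expected = tuple(actual_shape), tuple(expected_shape)
    if actual != expected:
        raise ConfigurationError(
            f"engine output shape mismatch: expected {expected}, got {actual}"
        )


# A matching engine passes silently.
assert_engine_output_shape((1, 512))

# A (1, 256) engine is rejected before any strategy object exists.
try:
    assert_engine_output_shape((1, 256))
    mismatch_message = None
except ConfigurationError as exc:
    mismatch_message = str(exc)
```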
AC-7: VprBackboneError on forward-pass failure
Given an InferenceRuntime test double that raises RuntimeError from forward
When strategy.embed_query(frame, calibration) is called
Then VprBackboneError is raised; ONE ERROR log kind="c2.vpr.backbone_error" is emitted; ONE FDR record kind="vpr.backbone_error" is emitted
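The error path in AC-7 can be sketched with test doubles and the standard logging module. Everything here is illustrative: FakeFdrClient.emit(kind, **fields) is a hypothetical API (the real FdrClient interface is defined elsewhere), and carrying kind as a logging extra field is an assumption about how structured logs are emitted.

```python
import logging


class VprBackboneError(Exception):
    """Stand-in for the AZ-336 error type of the same name."""


class FakeInferenceRuntime:
    def forward(self, engine_id, inputs):
        raise RuntimeError("simulated CUDA failure")


class FakeFdrClient:
    def __init__(self):
        self.records = []

    def emit(self, kind, **fields):  # hypothetical signature
        self.records.append({"kind": kind, **fields})


class ListHandler(logging.Handler):
    """Collects log records in memory so a test can count them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def run_forward(runtime, engine_id, tensor, fdr_client, logger, frame_id):
    """Run the forward pass; on RuntimeError log once, record once, then wrap."""
    try:
        return runtime.forward(engine_id, {"input": tensor})["embedding"]
    except RuntimeError as exc:
        logger.error("backbone forward pass failed", extra={"kind": "c2.vpr.backbone_error"})
        fdr_client.emit("vpr.backbone_error", frame_id=frame_id, reason=str(exc))
        raise VprBackboneError(str(exc)) from exc


logger = logging.getLogger("c2_vpr_backbone_sketch")
logger.setLevel(logging.ERROR)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)
fdr = FakeFdrClient()

caught = None
try:
    run_forward(FakeInferenceRuntime(), "engine-0", [[0.0]], fdr, logger, frame_id=7)
except VprBackboneError as exc:
    caught = exc
```

Chaining with `from exc` keeps the original RuntimeError available as `__cause__`, so the wrapped error still points at the underlying failure.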
AC-8: VprPreprocessError on corrupt image bytes
Given a NavCameraFrame with malformed image_bytes (not decodable)
When strategy.embed_query(frame, calibration) is called
Then VprPreprocessError is raised; ONE ERROR log kind="c2.vpr.preprocess_error" is emitted; ONE FDR record kind="vpr.preprocess_error" is emitted
AC-9: Calibration absent → centre-crop falls back to geometric centre + WARN log
Given a frame with calibration = None (or calibration.principal_point absent)
When embed_query(frame, calibration) is called
Then preprocessing succeeds with geometric-centre crop; ONE WARN log kind="c2.vpr.calibration_missing" is emitted; the embedding is L2-normalised (AC-2 still holds)
AC-10: IndexUnavailableError propagated unchanged from retrieve_topk
Given a C6 TileStore test double that raises IndexUnavailableError from faiss_topk
When strategy.retrieve_topk(query, k=10) is called
Then IndexUnavailableError is raised unchanged (NOT wrapped); no candidates returned
AC-11: Composition-root wiring — config.vpr.strategy = "ultra_vpr"
Given config.vpr.strategy = "ultra_vpr" AND a valid weights_path AND matching descriptor_dim in C6 sidecar
When compose_root(config) runs
Then a UltraVprStrategy instance is wired into the runtime root; the AZ-336 factory's pre-flight descriptor_dim validation passes; ONE INFO log kind="c2.vpr.ready" with {strategy: "ultra_vpr", descriptor_dim: 512, corpus_size: <N>} is emitted
AC-12: WARN log on top-1 distance above threshold
Given config.vpr.warn_top1_threshold = 0.30 AND a VprResult whose top-1 descriptor_distance = 0.42
When retrieve_topk returns
Then ONE WARN log kind="c2.vpr.top1_distance_above_threshold" with structured field {distance: 0.42, threshold: 0.30} is emitted
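The threshold check behind AC-12 is a strict comparison on the first (best) distance. A minimal sketch with the standard logging module follows; the helper name and the use of logging extra fields for distance and threshold are assumptions, not the project's logging API.

```python
import logging
from typing import Sequence


class ListHandler(logging.Handler):
    """Collects log records in memory so a test can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def warn_if_top1_above_threshold(
    distances: Sequence[float], threshold: float, logger: logging.Logger
) -> bool:
    """Emit one WARN when the top-1 distance exceeds the threshold; return whether it fired."""
    if not distances:
        return False
    top1 = distances[0]
    if top1 > threshold:
        logger.warning(
            "top-1 distance above threshold",
            extra={
                "kind": "c2.vpr.top1_distance_above_threshold",
                "distance": top1,
                "threshold": threshold,
            },
        )
        return True
    return False


logger = logging.getLogger("c2_vpr_warn_sketch")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

fired = warn_if_top1_above_threshold([0.42, 0.55], 0.30, logger)
quiet = warn_if_top1_above_threshold([0.10, 0.55], 0.30, logger)
```

A distance exactly equal to the threshold does not warn in this sketch, matching the "top-1 distance > threshold" wording.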
Non-Functional Requirements
Performance (deferred validation to C2-PT-01 / E-BBT; this task delivers the implementation):
- embed_query p95 ≤ 60 ms on Tier-1 Jetson Orin with TensorRT 10.3 FP16, bounded by the TRT engine forward-pass time + preprocessing overhead. The preprocessing path itself MUST be ≤ 5 ms p95 (so the TRT call has ~55 ms budget).
- retrieve_topk p95 ≤ 2 ms, bounded by C6 FAISS HNSW; this task contributes only the Python wrapping overhead.
- GPU memory: ≤ 600 MB resident for backbone weights (FP16 engine ~ 100-150 MB; remainder is workspace).
- System memory: ≤ 200 MB for the mmap'd FAISS index handle (C6 owns this; this task consumes).
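For local sanity checks ahead of C2-PT-01, a p95 figure can be computed from wall-clock samples. The sketch below uses the nearest-rank percentile definition, which is an assumption: the task text does not say which percentile method the performance test uses, and both helper names are hypothetical.

```python
import time
from typing import Callable, List, Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def measure_latencies_ms(fn: Callable[[], object], iterations: int) -> List[float]:
    """Call fn repeatedly and return each call's wall-clock duration in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        latencies.append((time.perf_counter_ns() - start) / 1e6)
    return latencies


latencies = measure_latencies_ms(lambda: sum(range(1000)), iterations=50)
p95_ms = percentile_nearest_rank(latencies, 95)
```

In a real measurement, fn would wrap strategy.embed_query(frame, calibration) on the target hardware, and the result would be compared against the 60 ms budget.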
Compatibility
- The TRT engine file format is owned by C10 / C7; this task consumes the produced .trt engine via config.vpr.backbone_weights_path. Engine version mismatches surface via the AZ-281 self-describing filename schema; the C7 load_engine enforces compatibility.
- The upstream UltraVPR research code drop is pinned per Plan-phase; weight-format changes between drops would require a new engine build (C10) and a re-run of C2-IT-01 to confirm recall@10 still passes.
Reliability
- Strategy is single-threaded by contract (INV-1, AZ-336); composition root binds to one ingest thread.
- L2-normalisation is unconditional (INV-3); raw UltraVPR embeddings are not L2-normalised by the upstream forward pass.
- VprBackboneError does not crash the process; downstream C5 falls back to VIO-only with provenance visual_propagated (AC-1.4).
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | isinstance(UltraVprStrategy(...), VprStrategy) | True |
| AC-2 | embed_query output | shape (512,), dtype float16, L2-norm == 1.0 ± 1e-3 |
| AC-3 | embed_query × 3 same frame | bit-exact embeddings (ULP-tolerant FP16) |
| AC-4 | retrieve_topk against fixture corpus | len == 10, sorted ascending, backbone_label == "ultra_vpr", descriptor_dim == 512 |
| AC-5 | descriptor_dim() × 100 | always 512 |
| AC-6 | TRT engine with wrong output shape | ConfigurationError at create time |
| AC-7 | InferenceRuntime.forward raises | VprBackboneError; ERROR log + FDR record |
| AC-8 | malformed image_bytes | VprPreprocessError; ERROR log + FDR record |
| AC-9 | calibration = None | preprocessing succeeds with geometric centre; WARN log |
| AC-10 | tile_store.faiss_topk raises IndexUnavailableError | propagated unchanged |
| AC-11 | compose_root(config="ultra_vpr") | wired; INFO log with {strategy, descriptor_dim, corpus_size} |
| AC-12 | top-1 distance > threshold | WARN log emitted |
| Preprocess-shape | preprocessor.preprocess(frame) output | shape (1, 3, 384, 384), dtype float16 |
| Preprocess-mean-std | preprocessing on a uniform-grey image | per-channel (grey - mean) / std matches expected to ULP |
| Preprocess-input-shape | preprocessor.input_shape() | returns (384, 384) |
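For the Preprocess-mean-std row, the expected values for a uniform-grey image can be derived by hand from the hard-coded ImageNet parameters. The sketch below computes them with the standard library only; the grey level of 128 and the helper names are assumptions chosen for illustration, and a real test would compare against the NumPy float16 output of the preprocessor.

```python
import struct
from typing import Tuple

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def normalise_pixel(rgb: Tuple[int, int, int]) -> Tuple[float, float, float]:
    """Apply (pixel / 255.0 - mean) / std per channel, then cast to FP16."""
    r, g, b = (
        to_fp16((channel / 255.0 - mean) / std)
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )
    return r, g, b


# Every pixel of a uniform-grey image maps to the same three channel values.
expected_grey = normalise_pixel((128, 128, 128))
```

Because every pixel is identical, the whole (1, 3, 384, 384) output should hold one constant per channel plane, which makes the fixture easy to assert against.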
Constraints
- The BackbonePreprocessor instance for UltraVPR lives next to the strategy, NOT in helpers/; preprocessing parameters are weights-coupled (description.md § 6 "C2-internal helper, NOT a shared helper").
- Preprocessing parameters are hard-coded: (384, 384) resize target, (0.485, 0.456, 0.406) ImageNet mean, (0.229, 0.224, 0.225) ImageNet std. These are weights-coupled per the upstream UltraVPR contract; making them config-knobs would let an operator silently break the AC-2.1b recall floor.
- L2-normalisation is mandatory even though some downstream code paths are robust to non-normalised embeddings; INV-3 from the contract is non-negotiable.
- Engine load happens at create time, NOT at first frame; the engine-output-shape assertion (AC-6) MUST fire at startup.
- The strategy holds the engine ID returned by inference_runtime.load_engine, NOT the engine itself; engine lifecycle is owned by C7.
- Constructor injection only: no import gps_denied_onboard.config inside the strategy module; config is consumed via the create factory.
- No GPU operations outside embed_query: __init__ does the engine load (one-time cost), embed_query does the per-frame forward pass; nothing else touches the GPU stream.
Risks & Mitigation
Risk 1: UltraVPR upstream code drop ships an unsupported ONNX op
- Risk: The TRT 10.3 ONNX importer doesn't support a custom op in UltraVPR's graph; engine compilation fails at C10 stage.
- Mitigation: Engine compile is C10's responsibility (AZ-321). This task consumes the produced engine and assumes it's loadable. If C10 cannot build the engine, the strategy cannot be wired — a hard upstream blocker that surfaces during AZ-321 implementation, NOT here.
Risk 2: FP16 precision insufficient for AC-2.1b recall@10 ≥ 0.95
- Risk: FP16 quantisation degrades embedding fidelity below the recall floor on the Derkachi corpus.
- Mitigation: C2-IT-01 (deferred to Step 9) is the validation gate. If FP16 fails, the operator can fall back to FP32 by rebuilding the engine via C10 with precision=fp32; this is a config-time decision, NOT a code change in this task. The strategy treats FP16 vs FP32 as transparent (the engine output dtype is asserted at load time; embedding dtype follows the engine).
Risk 3: Centre-crop with calibration's principal point introduces non-determinism if calibration changes mid-flight
- Risk: An operator hot-swaps calibration during flight; embeddings shift; recall drops silently.
- Mitigation: Calibration changes mid-flight are forbidden by the broader F1 / F2 / F3 lifecycle (calibration is loaded once per flight at takeoff). If a future cycle adds hot-swap support, a separate task adds calibration-versioning to embeddings.
Risk 4: Per-frame DEBUG log volume at 3 Hz × 10 distances = 30 entries/sec
- Risk: Default-on DEBUG logging floods journald.
- Mitigation: DEBUG kind="c2.vpr.frame_distances" is gated by config.vpr.debug_per_frame_distances (default false); operators enable it only for forensic investigation of a specific flight.
Risk 5: WARN-threshold default (0.30) needs calibration
- Risk: The 0.30 default threshold for top-1 distance WARN is a placeholder; production-tuned values come from FT-P-19 telemetry.
- Mitigation: config.vpr.warn_top1_threshold is config-driven (default 0.30); a follow-up cycle will tune from real flight FDR data. The default is a conservative starting point that surfaces obvious false-positives without flooding logs.
Runtime Completeness
- Named capability: production-default VprStrategy for top-K retrieval against the C6 FAISS corpus (architecture / E-C2 / solution.md "UltraVPR primary backbone" / AC-2.1b + AC-4.1).
- Production code that must exist: real UltraVprStrategy calling real C7 InferenceRuntime.forward with a real TRT-compiled UltraVPR engine; real UltraVprBackbonePreprocessor performing real OpenCV resize + ImageNet normalisation + FP16 cast; real L2-normalisation via real DescriptorNormaliser; real composition-root wiring in runtime_root.compose_root for the ultra_vpr strategy choice.
- Allowed external stubs: tests MAY use FakeInferenceRuntime returning pre-computed embeddings (AC-2..AC-7), FakeTileStore (AC-4 / AC-10 / AC-11), FakeFdrClient (verifying FDR record emission), a synthetic frame fixture for preprocessing tests; production wiring uses the real C7 + C6 + UltraVPR engine.
- Unacceptable substitutes: a Python-only NumPy implementation of UltraVPR's forward pass (would not satisfy C2-PT-01 latency at 60 ms p95; would defeat the GPU-bound architectural choice); skipping L2-normalisation (would break INV-3 and downstream cosine-similarity assumptions); making preprocessing parameters config-knobs (would let operators silently break AC-2.1b); engine load at first frame instead of create time (would defer the engine-output-shape assertion past startup, defeating fail-fast); per-strategy thread safety (the contract is single-thread; adding locks would mask the composition-root binding bug if it ever broke); a "demo mode" that returns dummy embeddings to bypass the TRT engine.