gps-denied-onboard/_docs/02_tasks/todo/AZ-337_c2_ultra_vpr.md
Oleksandr Bezdieniezhnykh 880eabcb3f Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components
(C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446
plus the _dependencies_table.md and component contract documents.

State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:39:48 +03:00


C2 UltraVPR Primary Backbone

Task: AZ-337_c2_ultra_vpr
Name: C2 UltraVPR Primary Backbone (TRT)
Description: Implement UltraVprStrategy, the production-default VprStrategy (per ADR-001 default selection). UltraVPR is the Documentary Lead's PRIMARY backbone selected at config time and ON in airborne / research / replay-cli binaries (per ADR-002 build-time exclusion map). Wraps the upstream UltraVPR research code drop, exposes its forward pass via the C7 InferenceRuntime (TensorRT 10.3 primary, ONNX-Runtime fallback), and produces L2-normalised float16 embeddings (D=512 typical) for FAISS HNSW retrieval. Includes the concrete UltraVprBackbonePreprocessor (resize / centre-crop / mean-std normalise per UltraVPR's input contract). The strategy MUST satisfy the AC-2.1b recall@10 ≥ 0.95 floor on the Derkachi normal segment and the C2-PT-01 latency budget (embed_query p95 ≤ 60 ms).
Complexity: 5 points
Dependencies: AZ-336_c2_vpr_strategy_protocol, AZ-263_initial_structure, AZ-269_config_loader, AZ-298_c7_tensorrt_runtime, AZ-303_c6_storage_interfaces, AZ-283_descriptor_normaliser, AZ-281_engine_filename_schema, AZ-321_c10_engine_compiler (engine compile path; UltraVPR engine is one of the engines C10 builds), AZ-266_log_module, AZ-272_fdr_record_schema
Component: c2_vpr (epic AZ-255 / E-C2)
Tracker: AZ-337
Epic: AZ-255 (E-C2)

Document Dependencies

  • _docs/02_document/contracts/c2_vpr/vpr_strategy_protocol.md — Protocol contract this task implements (every invariant MUST be satisfied).
  • _docs/02_document/components/02_c2_vpr/description.md — § 1 PRIMARY backbone designation; § 2 interface; § 5 backbone weights ≤ 600 MB GPU; § 7 GPU stream race notes; § 9 logging.
  • _docs/02_document/module-layout.md — c2_vpr Per-Component Mapping (ultra_vpr.py Internal); BUILD_VPR_ULTRA_VPR row in build-time exclusion map (ON for airborne/research/replay-cli, OFF for operator-tooling).
  • _docs/02_document/contracts/c7_inference/inference_runtime_protocol.md — InferenceRuntime interface (engine load, forward pass, output extraction).
  • _docs/02_document/contracts/shared_helpers/descriptor_normaliser.md — L2 normalisation contract (UltraVPR raw embeddings are NOT L2-normalised; this task MUST normalise).
  • _docs/02_document/contracts/shared_helpers/engine_filename_schema.md — TensorRT engine filename → metadata extraction.
  • _docs/02_document/components/02_c2_vpr/tests.md — C2-IT-01 (recall@10 ≥ 0.95 on Derkachi); C2-IT-02 (VprResult invariants); C2-PT-01 (embed_query p95 ≤ 60 ms; combined ≤ 65 ms; ≤ 600 MB GPU; ≤ 200 MB sys mem).

Problem

UltraVPR is the production-default backbone (description.md § 1 "Documentary Lead PRIMARY backbone"). Without this task:

  • The composition root has no concrete strategy to wire when config.vpr.strategy = "ultra_vpr" (the default value); the airborne binary cannot start.
  • AC-2.1b (recall@10 ≥ 0.95 on Derkachi) — the highest-priority C2 acceptance criterion — has no producer; the suite-level FT-P-19 satellite re-loc test cannot pass.
  • AC-4.1 latency budget for VPR is allocated against UltraVPR specifically (60 ms embed_query); without the TRT-backed implementation, the budget is unconsumable and the E2E latency target (400 ms p95) cannot be validated.
  • The 600 MB GPU memory ceiling for backbone weights is enforced at the implementation layer; without it, no operator can validate the airborne deployment fits the Tier-1 Jetson Orin's GPU memory budget.
  • UltraVPR has a non-trivial input preprocessing contract (specific resize target, centre-crop, ImageNet mean/std normalisation, FP16 cast); without UltraVprBackbonePreprocessor, every consumer would re-derive the contract → silent recall regression.

Outcome

  • src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py defining:
    • UltraVprStrategy class implementing the VprStrategy Protocol (AZ-336).
    • Constructor signature: __init__(self, runtime: InferenceRuntime, tile_store: TileStore, weights_path: Path, preprocessor: UltraVprBackbonePreprocessor, normaliser: DescriptorNormaliser, fdr_client: FdrClient).
    • embed_query(frame, calibration):
      1. tensor = self._preprocessor.preprocess(frame, calibration) (returns FP16 NCHW (1, 3, H, W)).
      2. raw = self._runtime.forward(self._engine_id, {"input": tensor})["embedding"] (returns FP16 (1, 512)).
      3. embedding = self._normaliser.l2_normalise(raw[0]) (returns FP16 (512,) with ||embedding||_2 == 1.0 ± 1e-3).
      4. Return VprQuery(frame_id, embedding, produced_at=monotonic_ns()).
      5. Catch RuntimeError / CudaError → wrap in VprBackboneError; emit ERROR log + FDR record kind="vpr.backbone_error".
    • retrieve_topk(query, k):
      1. distances, tile_ids = self._tile_store.faiss_topk(query.embedding, k) (delegates to C6 TileStore Public API).
      2. Build [VprCandidate(tile_id, distance, descriptor_dim=512) for ...].
      3. Return VprResult(query.frame_id, candidates, retrieved_at=monotonic_ns(), backbone_label="ultra_vpr").
      4. On IndexUnavailableError (raised by C6 TileStore on stale handle), re-raise unchanged.
    • descriptor_dim() -> int: returns 512 (the UltraVPR research code drop's published embedding dim; the value is asserted at engine-load time against the engine's output tensor shape; mismatch → ConfigurationError at startup).
    • Module-level create(config, tile_store, inference_runtime) -> VprStrategy:
      1. Resolve weights_path = config.vpr.backbone_weights_path (a TensorRT engine file produced by C10's engine compiler — AZ-321 — with the AZ-281 self-describing filename schema).
      2. Construct UltraVprBackbonePreprocessor(input_shape=(384, 384), mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)) (parameters from the upstream UltraVPR config, hard-coded here per CONST.SRP — these are weights-coupled, not config-knobs).
      3. Construct DescriptorNormaliser (or fetch from helpers; AZ-283).
      4. Load engine via inference_runtime.load_engine(weights_path) — the engine ID is captured for later forward calls.
      5. Assert engine output shape == (1, 512) FP16; mismatch → ConfigurationError.
      6. Construct and return UltraVprStrategy(...).
  • src/gps_denied_onboard/components/c2_vpr/_preprocessor_ultra_vpr.py (or _preprocessor.py shared scaffolding + concrete UltraVprBackbonePreprocessor):
    • Implements BackbonePreprocessor Protocol from AZ-336.
    • preprocess(frame, calibration):
      1. Decode frame.image_bytes to RGB uint8 ndarray (H_in, W_in, 3) via OpenCV / Pillow.
      2. Centre-crop to a square region of side min(H_in, W_in) using calibration's principal point if non-centre (otherwise geometric centre). Calibration is consumed here for principal-point alignment per the upstream UltraVPR contract; if calibration is absent, fall back to geometric centre with a WARN log.
      3. Resize to (384, 384) via OpenCV INTER_AREA for downscale, INTER_CUBIC for upscale.
      4. Normalise: (pixel/255.0 - mean) / std per channel; cast to FP16.
      5. Transpose HWC → CHW; add batch dim → NCHW.
      6. Return ndarray of shape (1, 3, 384, 384) dtype float16.
    • input_shape() -> tuple[int, ...]: returns (384, 384).
    • On any preprocessing failure (corrupt image bytes, calibration mismatch), raise VprPreprocessError and emit ERROR log + FDR record kind="vpr.preprocess_error".
  • Composition-root wiring: runtime_root.compose_root includes a path that, when config.vpr.strategy == "ultra_vpr", calls UltraVprStrategy.create(config, tile_store, inference_runtime) via the AZ-336 factory.
  • Logging per description.md § 9:
    • INFO kind="c2.vpr.ready" with {strategy: "ultra_vpr", descriptor_dim: 512, corpus_size: <N>} after engine load.
    • WARN kind="c2.vpr.top1_distance_above_threshold" if top-1 distance > config.vpr.warn_top1_threshold (default 0.30).
    • ERROR kind="c2.vpr.backbone_error" and kind="c2.vpr.preprocess_error" per error path.
    • DEBUG kind="c2.vpr.frame_distances" with top-K distances per frame (gated by config; off by default to avoid log volume at 3 Hz).
  • FDR records emitted: kind="vpr.embed_query" (per frame, with frame_id + backbone_label + bbox of distances), kind="vpr.backbone_error" and kind="vpr.preprocess_error" (per error).

Scope

Included

  • UltraVprStrategy class implementing the VprStrategy Protocol exactly per the AZ-336 contract.
  • UltraVprBackbonePreprocessor implementing BackbonePreprocessor Protocol with the upstream UltraVPR's published preprocessing parameters.
  • Module-level create(config, tile_store, inference_runtime) factory entry-point.
  • Engine-output-shape assertion at load time ((1, 512) FP16); mismatch → ConfigurationError.
  • L2-normalisation of every embedding via the AZ-283 DescriptorNormaliser helper.
  • Composition-root wiring path for config.vpr.strategy == "ultra_vpr".
  • Logging per description.md § 9 (INFO ready, WARN top-1-above-threshold, ERROR error paths, DEBUG per-frame distances).
  • FDR record emission for embed-query and error paths.
  • Unit tests covering all 7 invariants (INV-1..INV-7), the engine-output-shape assertion, the preprocessing contract, the L2-normalisation post-condition, the composition-root wiring path.
  • BUILD_VPR_ULTRA_VPR CMake flag wiring (per ADR-002): the strategy module is excluded from the operator-tooling binary.

Excluded

  • The VprStrategy Protocol + BackbonePreprocessor Protocol + DTOs + errors + factory — owned by AZ-336.
  • The DescriptorNormaliser helper — already AZ-283.
  • The C7 InferenceRuntime (engine load + forward pass) — owned by AZ-298 (TensorRT runtime).
  • The C6 TileStore.faiss_topk query — owned by AZ-303 / AZ-306; this task consumes the Public API.
  • Engine compile (.onnx → .trt) — owned by AZ-321 (c10_engine_compiler); this task consumes the produced .trt engine via config.vpr.backbone_weights_path.
  • Other backbones — AZ-338 (NetVLAD), AZ-339 (MegaLoc + MixVPR), AZ-340 (SelaVPR + EigenPlaces + SALAD).
  • FAISS HNSW wiring at the strategy level — retrieve_topk delegates to tile_store.faiss_topk; the FAISS index lifecycle (mmap, sidecar verify, handle invalidation) is owned by AZ-341.
  • Component-internal tests beyond Protocol + invariants + preprocessing-contract: C2-IT-01 (recall@10 acceptance test), C2-IT-03 (poisoned-tile), C2-IT-04 (scale-ratio), C2-PT-01 (latency NFR), C2-ST-01 (stale handle) are deferred to Step 9 / E-BBT.

Acceptance Criteria

AC-1: Protocol conformance
Given a constructed UltraVprStrategy instance
When isinstance(strategy, VprStrategy) is evaluated
Then the result is True; the instance has embed_query, retrieve_topk, descriptor_dim
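
A minimal sketch of the isinstance check in AC-1, assuming the AZ-336 VprStrategy Protocol is declared with typing.runtime_checkable (the contract document is not reproduced here, so that is an assumption). The Protocol body and both classes below are hypothetical stand-ins, not the real AZ-336 definitions.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class VprStrategy(Protocol):
    """Hypothetical stand-in for the AZ-336 Protocol; real signatures live there."""

    def embed_query(self, frame: Any, calibration: Any) -> Any: ...
    def retrieve_topk(self, query: Any, k: int) -> Any: ...
    def descriptor_dim(self) -> int: ...


class UltraVprStrategyStub:
    """Illustrative class exposing the three Protocol methods."""

    def embed_query(self, frame: Any, calibration: Any) -> Any:
        raise NotImplementedError

    def retrieve_topk(self, query: Any, k: int) -> Any:
        raise NotImplementedError

    def descriptor_dim(self) -> int:
        return 512


class MissingRetrieve:
    """Illustrative class that lacks retrieve_topk, so it must not conform."""

    def embed_query(self, frame: Any, calibration: Any) -> Any:
        raise NotImplementedError

    def descriptor_dim(self) -> int:
        return 512


conforms = isinstance(UltraVprStrategyStub(), VprStrategy)
rejected = isinstance(MissingRetrieve(), VprStrategy)
```

A runtime_checkable isinstance check only verifies that the named methods exist; it does not compare signatures or return types, so the remaining ACs still have to exercise behaviour.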

AC-2: embed_query produces L2-normalised FP16 (512,) embedding
Given a valid NavCameraFrame and CameraCalibration
When strategy.embed_query(frame, calibration) is called
Then a VprQuery is returned with embedding.shape == (512,), embedding.dtype == np.float16, ||embedding||_2 == 1.0 ± 1e-3
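
The post-condition in AC-2 can be illustrated with NumPy. This is a sketch only: the real normalisation is owned by the AZ-283 DescriptorNormaliser, whose internals are not shown in this spec. The function name l2_normalise matches the call in the Outcome section, but accumulating in float32 and rejecting zero-norm vectors are assumptions made for the example.

```python
import numpy as np


def l2_normalise(raw: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """L2-normalise a 1-D descriptor and return it as float16.

    Assumption: the norm is accumulated in float32 to limit FP16 rounding
    error, and a near-zero norm is treated as an error.
    """
    vector = raw.astype(np.float32)
    norm = float(np.linalg.norm(vector))
    if norm < eps:
        raise ValueError("cannot normalise a zero-norm descriptor")
    return (vector / norm).astype(np.float16)


# Synthetic stand-in for the raw (1, 512) FP16 engine output.
rng = np.random.default_rng(0)
raw = rng.standard_normal((1, 512)).astype(np.float16)

embedding = l2_normalise(raw[0])
norm_after = float(np.linalg.norm(embedding.astype(np.float32)))
```

Measuring the norm in float32 after the cast back to float16 is what makes the ± 1e-3 tolerance meaningful: the stored FP16 values carry rounding error, so the norm is close to, but generally not exactly, 1.0.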

AC-3: embed_query is deterministic (INV-2 + INV-6)
Given the same frame + calibration
When embed_query is called 3 times
Then all three returns have bit-exact embedding arrays (ULP-tolerant for FP16); frame_id and produced_at differ across calls but embedding does not

AC-4: retrieve_topk returns exactly k candidates sorted ascending
Given a corpus of 100 tiles loaded into C6 TileStore + a constructed VprQuery
When strategy.retrieve_topk(query, k=10) is called
Then len(candidates) == 10; [c.descriptor_distance for c in candidates] is non-strictly-ascending; backbone_label == "ultra_vpr"; candidates[0].descriptor_dim == 512
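
The following sketch shows the shape of the AC-4 check against a fake tile store. The DTO field names follow the ones quoted in this spec; the dataclass layouts, the free-function form of retrieve_topk, and the brute-force FakeTileStore are hypothetical stand-ins for the AZ-336 DTOs and the C6 Public API.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VprCandidate:
    tile_id: int
    descriptor_distance: float
    descriptor_dim: int


@dataclass(frozen=True)
class VprResult:
    frame_id: int
    candidates: Tuple[VprCandidate, ...]
    retrieved_at: int
    backbone_label: str


class FakeTileStore:
    """Brute-force stand-in for C6 TileStore.faiss_topk (hypothetical)."""

    def __init__(self, distance_by_tile: Dict[int, float]) -> None:
        self._distance_by_tile = dict(distance_by_tile)

    def faiss_topk(self, embedding, k: int) -> Tuple[List[float], List[int]]:
        # Ignores the embedding; returns the k smallest pre-set distances.
        ranked = sorted(self._distance_by_tile.items(), key=lambda kv: kv[1])[:k]
        return [d for _, d in ranked], [t for t, _ in ranked]


def retrieve_topk(tile_store, frame_id: int, embedding, k: int) -> VprResult:
    """Delegate to the tile store and wrap the hits in DTOs."""
    distances, tile_ids = tile_store.faiss_topk(embedding, k)
    candidates = tuple(
        VprCandidate(tile_id, distance, 512)
        for tile_id, distance in zip(tile_ids, distances)
    )
    return VprResult(frame_id, candidates, time.monotonic_ns(), "ultra_vpr")


# 100 tiles with distinct synthetic distances 0.00 .. 0.99 in shuffled order.
store = FakeTileStore({t: ((t * 37) % 100) / 100.0 for t in range(100)})
result = retrieve_topk(store, frame_id=7, embedding=None, k=10)
```

The strategy does no sorting of its own in this sketch; ordering is whatever the tile store returns, which is why the test double has to return ascending distances for the assertion to be meaningful.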

AC-5: descriptor_dim() is stable and returns 512
Given a constructed UltraVprStrategy
When descriptor_dim() is called 100 times
Then every call returns 512

AC-6: Engine output shape mismatch at load → ConfigurationError
Given a TRT engine whose output tensor shape is (1, 256) (not 512)
When UltraVprStrategy.create(config, tile_store, inference_runtime) is called
Then ConfigurationError is raised with message containing "engine output shape mismatch: expected (1, 512), got (1, 256)"; the strategy is NOT instantiated
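
A sketch of the fail-fast check behind AC-6. How the output shape is read from the C7 InferenceRuntime is not specified in this document, so the helper below simply takes the shape as an argument; assert_engine_output_shape and the local ConfigurationError class are hypothetical names used for illustration.

```python
from typing import Sequence, Tuple


class ConfigurationError(Exception):
    """Hypothetical stand-in for the project's configuration error type."""


EXPECTED_OUTPUT_SHAPE: Tuple[int, int] = (1, 512)


def assert_engine_output_shape(
    actual_shape: Sequence[int],
    expected_shape: Sequence[int] = EXPECTED_OUTPUT_SHAPE,
) -> None:
    """Raise ConfigurationError if the engine output shape is not as expected."""
    actual = tuple(actual_shape)
    expected = tuple(expected_shape)
    if actual != expected:
        raise ConfigurationError(
            f"engine output shape mismatch: expected {expected}, got {actual}"
        )


# A 256-dim engine must be rejected before any strategy object is built.
try:
    assert_engine_output_shape((1, 256))
    message = None
except ConfigurationError as exc:
    message = str(exc)
```

Running the check inside the factory, before the strategy constructor is called, is what guarantees the "strategy is NOT instantiated" clause.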

AC-7: VprBackboneError on forward-pass failure
Given an InferenceRuntime test double that raises RuntimeError from forward
When strategy.embed_query(frame, calibration) is called
Then VprBackboneError is raised; ONE ERROR log kind="c2.vpr.backbone_error" is emitted; ONE FDR record kind="vpr.backbone_error" is emitted
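
The error path in AC-7 could look roughly like the sketch below. The FdrClient method name (emit), the logger name, and the use of logging's extra mapping to carry the kind field are assumptions; the real AZ-266 log module and AZ-272 FDR schema may expose different APIs. Only RuntimeError is handled here because the origin of CudaError is not specified in this document.

```python
import logging


class VprBackboneError(Exception):
    """Hypothetical stand-in for the AZ-336 error type."""


class FailingRuntime:
    """Test double whose forward pass always fails."""

    def forward(self, engine_id, inputs):
        raise RuntimeError("simulated forward-pass failure")


class RecordingFdrClient:
    """Test double that stores emitted FDR records (method name assumed)."""

    def __init__(self):
        self.records = []

    def emit(self, kind, payload):
        self.records.append((kind, payload))


class ListHandler(logging.Handler):
    """Collects log records so a test can count them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("c2_vpr.sketch.backbone")
log.setLevel(logging.DEBUG)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)


def run_forward(runtime, fdr_client, engine_id, tensor, frame_id):
    """Run the forward pass; wrap runtime failures in VprBackboneError."""
    try:
        return runtime.forward(engine_id, {"input": tensor})["embedding"]
    except RuntimeError as exc:
        log.error(
            "backbone forward pass failed",
            extra={"kind": "c2.vpr.backbone_error", "frame_id": frame_id},
        )
        fdr_client.emit("vpr.backbone_error", {"frame_id": frame_id, "reason": str(exc)})
        raise VprBackboneError(str(exc)) from exc


fdr = RecordingFdrClient()
try:
    run_forward(FailingRuntime(), fdr, "engine-0", None, 42)
    wrapped = None
except VprBackboneError as exc:
    wrapped = exc
```

Chaining with "raise ... from exc" keeps the original RuntimeError reachable as __cause__, which helps forensic analysis without leaking the runtime's exception type to callers.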

AC-8: VprPreprocessError on corrupt image bytes
Given a NavCameraFrame with malformed image_bytes (not decodable)
When strategy.embed_query(frame, calibration) is called
Then VprPreprocessError is raised; ONE ERROR log kind="c2.vpr.preprocess_error" is emitted; ONE FDR record kind="vpr.preprocess_error" is emitted

AC-9: Calibration absent → centre-crop falls back to geometric centre + WARN log
Given a frame with calibration = None (or calibration.principal_point absent)
When embed_query(frame, calibration) is called
Then preprocessing succeeds with geometric-centre crop; ONE WARN log kind="c2.vpr.calibration_missing" is emitted; the embedding is L2-normalised (AC-2 still holds)
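
One way to compute the crop window for AC-9 is sketched below. The function name, the (cx, cy) pixel ordering of the principal point, and the clamping of the window to the image bounds are all assumptions for illustration; the upstream UltraVPR contract is the authority on how an off-centre principal point is handled.

```python
import logging
from typing import Optional, Tuple

log = logging.getLogger("c2_vpr.sketch.crop")


def square_crop_window(
    height: int,
    width: int,
    principal_point: Optional[Tuple[float, float]],
) -> Tuple[int, int, int]:
    """Return (top, left, side) of a square crop of side min(height, width).

    principal_point is assumed to be (cx, cy) in pixels; None triggers the
    geometric-centre fallback with a WARN log.
    """
    side = min(height, width)
    if principal_point is None:
        log.warning(
            "calibration missing; cropping around geometric centre",
            extra={"kind": "c2.vpr.calibration_missing"},
        )
        cx, cy = width / 2.0, height / 2.0
    else:
        cx, cy = principal_point
    left = int(round(cx - side / 2.0))
    top = int(round(cy - side / 2.0))
    # Assumption: clamp so the window never leaves the image.
    left = max(0, min(left, width - side))
    top = max(0, min(top, height - side))
    return top, left, side


fallback_window = square_crop_window(480, 640, None)
aligned_window = square_crop_window(480, 640, (350.0, 236.0))
```

For a 640 x 480 frame the fallback window starts at column 80, while a principal point at (350, 236) shifts it to column 110; the vertical offset is clamped to 0 because the crop already spans the full height.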

AC-10: IndexUnavailableError propagated unchanged from retrieve_topk
Given a C6 TileStore test double that raises IndexUnavailableError from faiss_topk
When strategy.retrieve_topk(query, k=10) is called
Then IndexUnavailableError is raised unchanged (NOT wrapped); no candidates returned
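
In contrast to the wrapped backbone error, AC-10 asks for pass-through. A minimal sketch, with a hypothetical local IndexUnavailableError class and a stale-handle test double standing in for the C6 types:

```python
class IndexUnavailableError(Exception):
    """Hypothetical stand-in for the C6 TileStore error type."""


class StaleTileStore:
    """Test double that raises a pre-built error from faiss_topk."""

    def __init__(self, error: Exception) -> None:
        self._error = error

    def faiss_topk(self, embedding, k: int):
        raise self._error


def retrieve_topk(tile_store, embedding, k: int):
    """Delegate to the tile store.

    There is deliberately no try/except here, so an IndexUnavailableError
    reaches the caller as the very same exception object.
    """
    distances, tile_ids = tile_store.faiss_topk(embedding, k)
    return list(zip(tile_ids, distances))


original = IndexUnavailableError("stale FAISS handle")
try:
    retrieve_topk(StaleTileStore(original), None, 10)
    caught = None
except IndexUnavailableError as exc:
    caught = exc
```

Checking identity (caught is original) rather than only the exception type is a stricter way to show that no wrapping or re-creation happened on the way up.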

AC-11: Composition-root wiring — config.vpr.strategy = "ultra_vpr"
Given config.vpr.strategy = "ultra_vpr" AND a valid weights_path AND matching descriptor_dim in C6 sidecar
When compose_root(config) runs
Then a UltraVprStrategy instance is wired into the runtime root; the AZ-336 factory's pre-flight descriptor_dim validation passes; ONE INFO log kind="c2.vpr.ready" with {strategy: "ultra_vpr", descriptor_dim: 512, corpus_size: <N>} is emitted

AC-12: WARN log on top-1 distance above threshold
Given config.vpr.warn_top1_threshold = 0.30 AND a VprResult whose top-1 descriptor_distance = 0.42
When retrieve_topk returns
Then ONE WARN log kind="c2.vpr.top1_distance_above_threshold" with structured field {distance: 0.42, threshold: 0.30} is emitted
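
The AC-12 threshold check is small enough to sketch with the standard logging module. The helper name and the use of extra to carry the structured fields are assumptions; the project's AZ-266 log module may provide its own structured-logging call.

```python
import logging
from typing import Sequence


class ListHandler(logging.Handler):
    """Collects log records so a test can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("c2_vpr.sketch.top1")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def warn_if_top1_distant(
    log: logging.Logger,
    distances: Sequence[float],
    threshold: float = 0.30,
) -> bool:
    """Emit one WARN if the best (first) distance is strictly above threshold."""
    if not distances:
        return False
    top1 = distances[0]
    if top1 > threshold:
        log.warning(
            "top-1 descriptor distance above threshold",
            extra={
                "kind": "c2.vpr.top1_distance_above_threshold",
                "distance": top1,
                "threshold": threshold,
            },
        )
        return True
    return False


warned_far = warn_if_top1_distant(logger, [0.42, 0.47, 0.51])
warned_near = warn_if_top1_distant(logger, [0.12, 0.30, 0.44])
```

The comparison is strict, matching the "top-1 distance > threshold" wording, so a distance exactly equal to the threshold stays silent.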

Non-Functional Requirements

Performance (deferred validation to C2-PT-01 / E-BBT; this task delivers the implementation):

  • embed_query p95 ≤ 60 ms on Tier-1 Jetson Orin with TensorRT 10.3 FP16 — bounded by the TRT engine forward-pass time + preprocessing overhead. The preprocessing path itself MUST be ≤ 5 ms p95 (so the TRT call has ~55 ms budget).
  • retrieve_topk p95 ≤ 2 ms — bounded by C6 FAISS HNSW; this task contributes only the Python wrapping overhead.
  • GPU memory: ≤ 600 MB resident for backbone weights (FP16 engine ~ 100-150 MB; remainder is workspace).
  • System memory: ≤ 200 MB for the mmap'd FAISS index handle (C6 owns this; this task consumes).
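
For local sanity checks ahead of the deferred C2-PT-01 run, a p95 figure can be computed with a nearest-rank percentile over wall-clock samples. The helper names below are hypothetical and this is not the C2-PT-01 harness; a real measurement would run on the Tier-1 Jetson Orin with warm-up iterations and the real engine.

```python
import time
from typing import Callable, List, Sequence


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample set."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def time_calls_ms(fn: Callable[[], object], iterations: int) -> List[float]:
    """Time fn() repeatedly and return per-call durations in milliseconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        durations.append((time.perf_counter_ns() - start) / 1e6)
    return durations


# Budget split stated above: 5 ms preprocessing plus roughly 55 ms for the TRT call.
PREPROCESS_BUDGET_MS = 5.0
EMBED_QUERY_BUDGET_MS = 60.0
trt_budget_ms = EMBED_QUERY_BUDGET_MS - PREPROCESS_BUDGET_MS

samples = time_calls_ms(lambda: sum(range(1000)), iterations=50)
observed_p95 = p95(samples)
```

Timing the preprocessing path and the full embed_query separately with the same helper makes it visible which side of the 5 ms / 55 ms split is consuming the budget.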

Compatibility

  • The TRT engine file format is owned by C10 / C7; this task consumes the produced .trt engine via config.vpr.backbone_weights_path. Engine version mismatches surface via the AZ-281 self-describing filename schema; the C7 load_engine enforces compatibility.
  • The upstream UltraVPR research code drop is pinned per Plan-phase; weight-format changes between drops would require a new engine build (C10) and a re-run of C2-IT-01 to confirm recall@10 still passes.

Reliability

  • Strategy is single-threaded by contract (INV-1, AZ-336); composition root binds to one ingest thread.
  • L2-normalisation is unconditional (INV-3); raw UltraVPR embeddings are not L2-normalised by the upstream forward pass.
  • VprBackboneError does not crash the process; downstream C5 falls back to VIO-only with provenance visual_propagated (AC-1.4).

Unit Tests

| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | isinstance(UltraVprStrategy(...), VprStrategy) | True |
| AC-2 | embed_query output | shape (512,), dtype float16, L2-norm == 1.0 ± 1e-3 |
| AC-3 | embed_query × 3 same frame | bit-exact embeddings (ULP-tolerant FP16) |
| AC-4 | retrieve_topk against fixture corpus | len == 10, sorted ascending, backbone_label == "ultra_vpr", descriptor_dim == 512 |
| AC-5 | descriptor_dim() × 100 | always 512 |
| AC-6 | TRT engine with wrong output shape | ConfigurationError at create time |
| AC-7 | InferenceRuntime.forward raises | VprBackboneError; ERROR log + FDR record |
| AC-8 | malformed image_bytes | VprPreprocessError; ERROR log + FDR record |
| AC-9 | calibration = None | preprocessing succeeds with geometric centre; WARN log |
| AC-10 | tile_store.faiss_topk raises IndexUnavailableError | propagated unchanged |
| AC-11 | compose_root(config="ultra_vpr") | wired; INFO log with {strategy, descriptor_dim, corpus_size} |
| AC-12 | top-1 distance > threshold | WARN log emitted |
| Preprocess-shape | preprocessor.preprocess(frame) output | shape (1, 3, 384, 384), dtype float16 |
| Preprocess-mean-std | preprocessing on a uniform-grey image | per-channel (grey - mean) / std matches expected to ULP |
| Preprocess-input-shape | preprocessor.input_shape() | returns (384, 384) |
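
The three preprocessing rows can be illustrated with a NumPy-only sketch of the pipeline. Two simplifications are assumptions made so the example has no OpenCV dependency: image decoding is skipped (the input is already an RGB uint8 array), and a nearest-neighbour index resize stands in for INTER_AREA / INTER_CUBIC. The crop uses the geometric centre only. The function name preprocess_rgb is hypothetical.

```python
import numpy as np

# Hard-coded, weights-coupled parameters quoted in this spec.
TARGET = 384
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_rgb(image: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) uint8 RGB array to a (1, 3, 384, 384) float16 tensor."""
    if image.ndim != 3 or image.shape[2] != 3 or image.dtype != np.uint8:
        raise ValueError("expected an (H, W, 3) uint8 RGB array")
    height, width, _ = image.shape

    # Centre-crop to a square of side min(H, W) around the geometric centre.
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    crop = image[top:top + side, left:left + side]

    # Nearest-neighbour resize: a stand-in for the OpenCV interpolation modes.
    index = (np.arange(TARGET) * side) // TARGET
    resized = crop[index][:, index]

    # Scale to [0, 1], apply per-channel mean/std, then HWC -> CHW -> NCHW.
    normalised = (resized.astype(np.float32) / 255.0 - MEAN) / STD
    chw = np.transpose(normalised, (2, 0, 1))
    return chw[np.newaxis].astype(np.float16)


# Uniform-grey fixture: every output pixel in a channel should be identical.
grey = np.full((480, 640, 3), 128, dtype=np.uint8)
tensor = preprocess_rgb(grey)
expected_per_channel = (128.0 / 255.0 - MEAN) / STD
```

A uniform-grey input is a convenient fixture because the result is independent of the crop position and the interpolation mode, so the test isolates the mean/std arithmetic and the FP16 cast.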

Constraints

  • The BackbonePreprocessor instance for UltraVPR lives next to the strategy, NOT in helpers/ — preprocessing parameters are weights-coupled (description.md § 6 "C2-internal helper, NOT a shared helper").
  • Preprocessing parameters are hard-coded: (384, 384) resize target, (0.485, 0.456, 0.406) ImageNet mean, (0.229, 0.224, 0.225) ImageNet std. These are weights-coupled per the upstream UltraVPR contract; making them config-knobs would let an operator silently break the AC-2.1b recall floor.
  • L2-normalisation is mandatory even though some downstream code paths are robust to non-normalised embeddings — INV-3 from the contract is non-negotiable.
  • Engine load happens at create time, NOT at first frame — the engine-output-shape assertion (AC-6) MUST fire at startup.
  • The strategy holds the engine ID returned by inference_runtime.load_engine, NOT the engine itself — engine lifecycle is owned by C7.
  • Constructor injection only — no import gps_denied_onboard.config inside the strategy module; config is consumed via the create factory.
  • No GPU operations outside embed_query: __init__ does the engine load (one-time cost), embed_query does the per-frame forward pass; nothing else touches the GPU stream.

Risks & Mitigation

Risk 1: UltraVPR upstream code drop ships an unsupported ONNX op

  • Risk: The TRT 10.3 ONNX importer doesn't support a custom op in UltraVPR's graph; engine compilation fails at C10 stage.
  • Mitigation: Engine compile is C10's responsibility (AZ-321). This task consumes the produced engine and assumes it's loadable. If C10 cannot build the engine, the strategy cannot be wired — a hard upstream blocker that surfaces during AZ-321 implementation, NOT here.

Risk 2: FP16 precision insufficient for AC-2.1b recall@10 ≥ 0.95

  • Risk: FP16 quantisation degrades embedding fidelity below the recall floor on the Derkachi corpus.
  • Mitigation: C2-IT-01 (deferred to Step 9) is the validation gate. If FP16 fails, the operator can fall back to FP32 by rebuilding the engine via C10 with precision=fp32 — this is a config-time decision, NOT a code change in this task. The strategy treats FP16 vs FP32 as transparent (the engine output dtype is asserted at load time; embedding dtype follows the engine).

Risk 3: Centre-crop with calibration's principal point introduces non-determinism if calibration changes mid-flight

  • Risk: An operator hot-swaps calibration during flight; embeddings shift; recall drops silently.
  • Mitigation: Calibration changes mid-flight are forbidden by the broader F1 / F2 / F3 lifecycle (calibration is loaded once per flight at takeoff). If a future cycle adds hot-swap support, a separate task adds calibration-versioning to embeddings.

Risk 4: Per-frame DEBUG log volume at 3 Hz × 10 distances = 30 entries/sec

  • Risk: Default-on DEBUG logging floods journald.
  • Mitigation: DEBUG kind="c2.vpr.frame_distances" is gated by config.vpr.debug_per_frame_distances (default false); operators enable it only for forensic investigation of a specific flight.

Risk 5: WARN-threshold default (0.30) needs calibration

  • Risk: The 0.30 default threshold for top-1 distance WARN is a placeholder; production-tuned values come from FT-P-19 telemetry.
  • Mitigation: config.vpr.warn_top1_threshold is config-driven (default 0.30); a follow-up cycle will tune from real flight FDR data. The default is a conservative starting point that surfaces obvious false-positives without flooding logs.

Runtime Completeness

  • Named capability: production-default VprStrategy for top-K retrieval against the C6 FAISS corpus (architecture / E-C2 / solution.md "UltraVPR primary backbone" / AC-2.1b + AC-4.1).
  • Production code that must exist: real UltraVprStrategy calling real C7 InferenceRuntime.forward with a real TRT-compiled UltraVPR engine; real UltraVprBackbonePreprocessor performing real OpenCV resize + ImageNet normalisation + FP16 cast; real L2-normalisation via real DescriptorNormaliser; real composition-root wiring in runtime_root.compose_root for the ultra_vpr strategy choice.
  • Allowed external stubs: tests MAY use FakeInferenceRuntime returning pre-computed embeddings (AC-2..AC-7), FakeTileStore (AC-4 / AC-10 / AC-11), FakeFdrClient (verifying FDR record emission), a synthetic frame fixture for preprocessing tests; production wiring uses the real C7 + C6 + UltraVPR engine.
  • Unacceptable substitutes: a Python-only NumPy implementation of UltraVPR's forward pass (would not satisfy C2-PT-01 latency at 60 ms p95; would defeat the GPU-bound architectural choice); skipping L2-normalisation (would break INV-3 and downstream cosine-similarity assumptions); making preprocessing parameters config-knobs (would let operators silently break AC-2.1b); engine load at first frame instead of create time (would defer the engine-output-shape assertion past startup, defeating fail-fast); per-strategy thread safety (the contract is single-thread; adding locks would mask the composition-root binding bug if it ever broke); a "demo mode" that returns dummy embeddings to bypass the TRT engine.