gps-denied-onboard/_docs/02_document/components/02_c2_vpr/description.md
Oleksandr Bezdieniezhnykh c1f27e4681 [autodev] Step 13 partial: c1/c2/c2_5/c3 cycle-1 doc sync
Item 2 (C1) + item 3 batch 1 of ~5 (C2 VPR, C2.5 Rerank, C3 Matcher)
of the cycle-1 component-description reconciliation called out in
ripple_log_cycle1.md.

For each touched description.md:
- Add a "Cycle-1 operational reality" paragraph in section 1 that
  names the _STRATEGY_REGISTRY + register_airborne_strategies()
  runtime gate (AZ-591), the pre_constructed dict path through
  compose_root (AZ-618 umbrella), the per-component
  AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS row, and any cycle-1
  strategy-default vs documented-primary disambiguation
  (net_vlad as the C2 default; xfeat parked from the C3 airborne
  registry).
- Relax the OpenCV row in section 5 Key Dependencies to the
  D-CROSS-CVE-1 cycle-1 pin (>=4.11.0.86,<4.12) wherever the
  component imports cv2 (C2 preprocessors, C2.5 ORB placeholder,
  C3 RANSAC + reprojection).
- Add a "Cycle-1 Tier-2 follow-up dependencies" subsection in
  section 7 only for components with a strategy module that is
  built but parked from the airborne registry (C3 xfeat).

Refresh ripple_log_cycle1.md follow-up ordering with per-batch
progress + extracted batch pattern so the next batch session has
a self-contained recipe. Bump _autodev_state.md sub_step.detail
to reflect batch 1 completion (10 components + 8 helpers + tests/
remain).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 16:49:41 +03:00


C2 — Visual Place Recognition

1. High-Level Overview

Purpose: given the current NavCameraFrame, retrieve the top-K=10 candidate satellite tiles from the pre-cached corpus by descriptor similarity. C2 owns the retrieval step; C2.5 narrows K=10 → N=3 via inlier-based re-rank.

Architectural Pattern: Strategy — VprStrategy interface; concrete implementations (UltraVPR primary, MegaLoc secondary, MixVPR / SelaVPR / EigenPlaces / NetVLAD / SALAD additional candidates) selected at startup by config (ADR-001); build-time gated per-implementation by BUILD_* flags (ADR-002); composition-root wired (ADR-009).

Cycle-1 operational reality: the airborne binary wires C2 through the _STRATEGY_REGISTRY + register_airborne_strategies() runtime gate (AZ-591) on top of the build-flag matrix, and constructor injection flows through the pre_constructed dict passed to compose_root(config, pre_constructed=...) (AZ-618 umbrella → AZ-620 c6 storage phase + AZ-623 c7 inference phase). All seven backbones (ultra_vpr, net_vlad, mega_loc, mix_vpr, sela_vpr, eigen_places, salad) have wired strategy modules + _preprocessor_* siblings + _faiss_bridge; their BUILD_VPR_<variant> env flags default OFF (tests/CI must opt in per strategy — see runtime_root/vpr_factory.py::_is_build_flag_on). The cycle-1 C2VprConfig.strategy default is net_vlad (the mandatory simple-baseline per Plan-phase D-C2-1) — ultra_vpr remains the Documentary Lead's PRIMARY backbone but additionally requires a pre-compiled .trt engine produced by C10's engine compiler (AZ-321). The c2_vpr slot lists ("c6_descriptor_index", "c7_inference") in AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS; missing keys raise AirborneBootstrapError at composition time, not at first frame.

Upstream dependencies:

  • Camera ingest thread → NavCameraFrame (parallel fan-out with C1; same frame, distinct queue depth).
  • C7 InferenceRuntime → backbone forward pass (TRT/ONNX/PyTorch per active runtime).
  • C6 DescriptorIndex → FAISS HNSW lookup over pre-cached tile descriptors.
  • Camera calibration artifact — for backbone input preprocessing (resize/crop/normalise).

Downstream consumers:

  • C2.5 ReRanker (consumes VprResult).

2. Internal Interfaces

Interface: VprStrategy

| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| embed_query | NavCameraFrame, CameraCalibration | VprQuery | No | VprBackboneError |
| retrieve_topk | VprQuery, k: int | VprResult | No | IndexUnavailableError, VprBackboneError |
| descriptor_dim | () | int | No | |

Input DTOs:

NavCameraFrame:                see C1 spec — same DTO

VprQuery:
  frame_id:        uuid (required)
  embedding:       ndarray[D, dtype=float16|float32] (required) — D depends on backbone
  produced_at:     monotonic_ns

Output DTOs:

VprResult:
  frame_id:        uuid
  candidates:      list[VprCandidate] (length = k, ranked by descriptor distance ascending)
  retrieved_at:    monotonic_ns
  backbone_label:  string — for FDR provenance

VprCandidate:
  tile_id:                composite (zoomLevel, lat, lon)
  descriptor_distance:    float — backbone-specific metric (cosine for L2-normalised embeddings)
  descriptor_dim:         int
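One way the interface and DTOs above might map onto Python types is sketched below. The field names follow the spec; everything else is an assumption for illustration: the use of `typing.Protocol` and frozen dataclasses, the plain sequence standing in for the ndarray embedding, and the ordering check in `__post_init__`.

```python
from dataclasses import dataclass
from typing import Any, Protocol, Sequence, Tuple
from uuid import UUID

@dataclass(frozen=True)
class VprQuery:
    frame_id: UUID
    embedding: Sequence[float]  # ndarray[D] in the spec; a sequence keeps the sketch dependency-free
    produced_at: int            # monotonic_ns

@dataclass(frozen=True)
class VprCandidate:
    tile_id: Tuple[int, float, float]  # (zoomLevel, lat, lon)
    descriptor_distance: float
    descriptor_dim: int

@dataclass(frozen=True)
class VprResult:
    frame_id: UUID
    candidates: Tuple[VprCandidate, ...]
    retrieved_at: int     # monotonic_ns
    backbone_label: str   # for FDR provenance

    def __post_init__(self):
        # Assumed guard: enforce the "ranked by descriptor distance ascending" contract.
        distances = [c.descriptor_distance for c in self.candidates]
        if distances != sorted(distances):
            raise ValueError("candidates must be ranked by ascending descriptor distance")

class VprStrategy(Protocol):
    # Frame and calibration types are owned by other components, hence Any here.
    def embed_query(self, frame: Any, calibration: Any) -> VprQuery: ...
    def retrieve_topk(self, query: VprQuery, k: int) -> VprResult: ...
    def descriptor_dim(self) -> int: ...
```

Validating the ordering at construction time is a design choice of this sketch; it lets C2.5 rely on the ranking without re-sorting.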

3. External API Specification

Not applicable — internal-only component.

4. Data Access Patterns

Queries

| Query | Frequency | Hot Path | Index Needed |
|---|---|---|---|
| FAISS HNSW top-K=10 search | 3 Hz (per nav frame) | Yes | Yes: pre-built HNSW (C6) |

Caching Strategy

| Data | Cache Type | TTL | Invalidation |
|---|---|---|---|
| Backbone weights | TRT engine on disk + GPU resident | flight lifetime | Manifest content-hash gate (D-C10-3) at takeoff |
| FAISS HNSW index | mmap (C6 owns the file) | flight lifetime | Same as above |

Storage Estimates

C2 itself stores no persistent data; it consumes C6's descriptor index. Sizing belongs in C6.

Data Management

C2 is read-only against C6 during F3/F4/F6. Pre-flight, once C11 TileDownloader has populated C6, F1 triggers C10 to call embed_query on every staged tile, which populates the descriptor matrix consumed by C6.

5. Implementation Details

Algorithmic Complexity: HNSW search scales approximately as O(log N) in corpus size N for fixed k=10; the backbone forward pass is O(1) per frame (GPU-bound).

State Management: stateless per-frame; the only persistent state is the loaded backbone weights and the FAISS index pointer (held by C6 and passed in via constructor).

Key Dependencies:

| Library | Version | Purpose |
|---|---|---|
| FAISS (Python + C++) | upstream HEAD pinned per Plan-phase | HNSW retrieval; consumed via C6 |
| TensorRT | 10.3 (JetPack 6.2 pin) | Primary inference backend; consumed via C7 |
| ONNX Runtime + TRT EP | matches C7 | Fallback backend |
| PyTorch | matches simple-baseline track | FP16 baseline (NetVLAD / MixVPR mandatory) |
| UltraVPR (research code drop) | upstream HEAD pinned per Plan-phase | Documentary Lead PRIMARY backbone |
| MegaLoc, MixVPR, SelaVPR, EigenPlaces, NetVLAD | upstream HEAD pinned per Plan-phase | Secondary + mandatory simple-baselines |
| OpenCV (cv2) | >=4.11.0.86,<4.12 (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred; see _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md) | Image decode + colour-space conversions in the per-strategy _preprocessor_*.py modules |

Error Handling Strategy:

  • VprBackboneError: backbone forward pass failed (CUDA OOM, TRT engine deserialize mismatch). C2 emits no VprResult; C5 falls back to VIO-only with provenance label visual_propagated (AC-1.4).
  • IndexUnavailableError: FAISS index handle invalid (e.g., post-F8 reboot before warm-up). Same fallback as above; F8 recovery flow re-mmaps the index.
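The fallback behaviour for both error types could look roughly like the sketch below. The function name `run_vpr_step` and the tuple return shape are hypothetical; in the design above the provenance label is applied by C5, so the label is returned here only to make the hand-off visible.

```python
class VprBackboneError(RuntimeError):
    """Backbone forward pass failed (for example CUDA OOM)."""

class IndexUnavailableError(RuntimeError):
    """FAISS index handle is invalid."""

def run_vpr_step(strategy, frame, calibration, k=10):
    """Return (result, fallback_label).

    On success fallback_label is None. On either C2 error no VprResult is
    emitted (result is None) and the label tells the consumer to propagate
    from VIO only.
    """
    try:
        query = strategy.embed_query(frame, calibration)
        return strategy.retrieve_topk(query, k), None
    except (VprBackboneError, IndexUnavailableError):
        # Both errors share one fallback path; recovery of the index itself
        # belongs to the F8 flow and is out of scope for this sketch.
        return None, "visual_propagated"
```

Catching only the two documented error types keeps unexpected exceptions visible instead of silently downgrading them to a VIO-only frame.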

6. Extensions and Helpers

| Helper | Purpose | Used By |
|---|---|---|
| BackbonePreprocessor | resize / crop / normalise per backbone's input contract | C2 only; keep inside the component, not a shared helper |
| DescriptorNormaliser | L2-normalise descriptors so cosine similarity aligns with Euclidean distance | C2 (query side), C10 (corpus side at cache artifact build) |
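The reason DescriptorNormaliser must run on both the query and the corpus side is the identity ||a - b||^2 = 2 - 2 cos(a, b), which holds only for unit-length vectors. The sketch below checks it in plain Python; the function names are illustrative and not the helper's actual API.

```python
import math

def l2_normalise(vector):
    """Scale a descriptor to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vector]

def cosine_similarity_unit(a, b):
    """Cosine similarity of two unit vectors, which reduces to their dot product."""
    return sum(x * y for x, y in zip(a, b))

def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# For unit vectors the two metrics are monotonically related, so a nearest
# neighbour under Euclidean distance is also the most cosine-similar one.
a = l2_normalise([3.0, 4.0])
b = l2_normalise([1.0, 0.0])
assert math.isclose(squared_euclidean(a, b), 2.0 - 2.0 * cosine_similarity_unit(a, b))
```

If only one side were normalised, the identity would break and a Euclidean index would no longer rank candidates by cosine similarity.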

7. Caveats & Edge Cases

Known limitations:

  • VPR is sensitive to scene change between cache build and flight time — AC-NEW-6 freshness gating is the project-level mitigation, not a C2 concern.
  • Backbone choice is constrained by ADR-002: only the linked-in implementations are selectable at runtime.

Potential race conditions:

  • Concurrent embed_query calls on a single strategy instance can race on the GPU stream. Bind one strategy instance to one ingest thread; the composition root enforces this.
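A simple way to make the one-instance-per-thread rule fail loudly is a thin wrapper that remembers the first calling thread. This is a sketch only: `SingleThreadStrategy` is a hypothetical name, and the text does not say how the composition root actually enforces the binding.

```python
import threading

class SingleThreadStrategy:
    """Wrap a strategy and reject calls from any thread other than the first caller."""

    def __init__(self, strategy):
        self._strategy = strategy
        self._owner = None
        self._lock = threading.Lock()

    def _check_owner(self):
        ident = threading.get_ident()
        with self._lock:
            if self._owner is None:
                self._owner = ident  # bind to the first thread that calls in
            elif self._owner != ident:
                raise RuntimeError("strategy instance is bound to another ingest thread")

    def embed_query(self, frame, calibration):
        self._check_owner()
        return self._strategy.embed_query(frame, calibration)

    def retrieve_topk(self, query, k):
        self._check_owner()
        return self._strategy.retrieve_topk(query, k)
```

The guard does not serialise access; it turns an accidental second caller into an immediate error rather than a silent race on the GPU stream.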

Performance bottlenecks:

  • Backbone forward pass is the dominant cost (~30-80 ms on Jetson, depending on the backbone). FAISS HNSW search is sub-millisecond for 100k-tile corpora.
  • D-CROSS-LATENCY-1 hybrid does not change C2 behaviour — C2's budget is fixed; the auto-degrade happens at C4.

8. Dependency Graph

Must be implemented after: C6 (descriptor index), C7 (inference runtime), C10 (descriptor population at cache artifact build).

Can be implemented in parallel with: C1, C8 — independent paths.

Blocks: C2.5 (no candidates without VprResult), F3 / F6.

9. Logging Strategy

| Log Level | When | Example |
|---|---|---|
| ERROR | VprBackboneError or IndexUnavailableError | VPR backbone OOM: backbone=ultravpr, frame=12345 |
| WARN | top-1 distance exceeds drift threshold (potential false-positive retrieval) | VPR top-1 distance 0.42 above warn threshold 0.30; backbone=ultravpr |
| INFO | Strategy ready; backbone loaded | VPR ready: backbone=ultravpr, dim=512, corpus_size=87654 |
| DEBUG | Per-frame top-K distances | VPR frame=12345 top10_distances=[0.12, 0.14, ...] |

Log format: structured JSON. Log storage: stdout / journald / FDR via C13 (ERROR + WARN only).
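As an illustration of the WARN row in structured JSON form, the sketch below builds one record when the top-1 distance exceeds the warn threshold. The function name and the JSON field names are assumptions; the document fixes only the log level, the trigger condition, and the example values.

```python
import json

def top1_warning_record(frame_id, backbone, distances, warn_threshold):
    """Return a JSON log line if the best candidate looks suspicious, else None.

    `distances` is expected in ascending order, so distances[0] is the top-1 match.
    """
    if not distances:
        return None
    top1 = distances[0]
    if top1 <= warn_threshold:
        return None
    record = {
        "level": "WARN",                 # hypothetical field names throughout
        "component": "c2_vpr",
        "event": "vpr_top1_distance_above_threshold",
        "backbone": backbone,
        "frame": frame_id,
        "top1_distance": top1,
        "warn_threshold": warn_threshold,
    }
    return json.dumps(record, sort_keys=True)
```

Returning the serialised line instead of writing it keeps the sketch independent of the sink, which in this design is stdout, journald, or FDR via C13.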