gps-denied-onboard/_docs/02_document/components/02_c2_vpr/description.md
Oleksandr Bezdieniezhnykh 64542d32fc Update autodev state, architecture documentation, and glossary terms
Transitioned the autodev state to phase 21, reflecting the completion of Step 5 and the drafting of Step 6 epics. Revised the architecture documentation to clarify the roles of the Tile Manager and its components, ensuring accurate representation of the system's operational flow. Updated glossary entries for Flight State and Operator to incorporate recent changes and enhance clarity on component interactions and responsibilities.
2026-05-10 00:21:34 +03:00


C2 — Visual Place Recognition

1. High-Level Overview

Purpose: given the current NavCameraFrame, retrieve the top-K=10 candidate satellite tiles from the pre-cached corpus by descriptor similarity. C2 owns the retrieval step; C2.5 narrows K=10 → N=3 via inlier-based re-rank.

Architectural Pattern: Strategy — VprStrategy interface; concrete implementations (UltraVPR primary, MegaLoc secondary, MixVPR / SelaVPR / EigenPlaces / NetVLAD / SALAD additional candidates) selected at startup by config (ADR-001); build-time gated per-implementation by BUILD_* flags (ADR-002); composition-root wired (ADR-009).

Upstream dependencies:

  • Camera ingest thread → NavCameraFrame (parallel fan-out with C1; same frame, distinct queue depth).
  • C7 InferenceRuntime → backbone forward pass (TRT/ONNX/PyTorch per active runtime).
  • C6 DescriptorIndex → FAISS HNSW lookup over pre-cached tile descriptors.
  • Camera calibration artifact — for backbone input preprocessing (resize/crop/normalise).

Downstream consumers:

  • C2.5 ReRanker (consumes VprResult).

2. Internal Interfaces

Interface: VprStrategy

| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| embed_query | NavCameraFrame, CameraCalibration | VprQuery | No | VprBackboneError |
| retrieve_topk | VprQuery, k: int | VprResult | No | IndexUnavailableError, VprBackboneError |
| descriptor_dim | () | int | No | |

Input DTOs:

NavCameraFrame:                see C1 spec — same DTO

VprQuery:
  frame_id:        uuid (required)
  embedding:       ndarray[D, dtype=float16|float32] (required) — D depends on backbone
  produced_at:     monotonic_ns

Output DTOs:

VprResult:
  frame_id:        uuid
  candidates:      list[VprCandidate] (length = k, ranked by descriptor distance ascending)
  retrieved_at:    monotonic_ns
  backbone_label:  string — for FDR provenance

VprCandidate:
  tile_id:                composite (zoomLevel, lat, lon)
  descriptor_distance:    float — backbone-specific metric (cosine for L2-normalised embeddings)
  descriptor_dim:         int
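The interface and DTOs above could be expressed in Python roughly as follows. This is a sketch under assumptions: the embedding is typed as a plain float sequence instead of an ndarray, the tile_id element types (int, float, float) are guessed from the field names, and the ordering check in `__post_init__` is an illustrative addition, not something the spec mandates.

```python
import time
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple
from uuid import UUID, uuid4


@dataclass(frozen=True)
class VprQuery:
    frame_id: UUID
    embedding: Sequence[float]  # stand-in for ndarray[D]; D depends on backbone
    produced_at: int            # monotonic_ns


@dataclass(frozen=True)
class VprCandidate:
    tile_id: Tuple[int, float, float]  # (zoomLevel, lat, lon); types assumed
    descriptor_distance: float
    descriptor_dim: int


@dataclass(frozen=True)
class VprResult:
    frame_id: UUID
    candidates: List[VprCandidate]
    retrieved_at: int    # monotonic_ns
    backbone_label: str  # for FDR provenance

    def __post_init__(self):
        # Illustrative guard: candidates are ranked by ascending distance.
        distances = [c.descriptor_distance for c in self.candidates]
        if distances != sorted(distances):
            raise ValueError("candidates must be ranked by ascending distance")


class VprStrategy(Protocol):
    """Structural type for any backbone implementation."""

    def embed_query(self, frame, calibration) -> VprQuery: ...
    def retrieve_topk(self, query: VprQuery, k: int) -> VprResult: ...
    def descriptor_dim(self) -> int: ...
```

Frozen dataclasses fit the stateless per-frame design: a result cannot be mutated after it leaves C2.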

3. External API Specification

Not applicable — internal-only component.

4. Data Access Patterns

Queries

| Query | Frequency | Hot Path | Index Needed |
|---|---|---|---|
| FAISS HNSW top-K=10 search | 3 Hz (per nav frame) | Yes | Yes: pre-built HNSW (C6) |

Caching Strategy

| Data | Cache Type | TTL | Invalidation |
|---|---|---|---|
| Backbone weights | TRT engine on disk + GPU resident | flight lifetime | Manifest content-hash gate (D-C10-3) at takeoff |
| FAISS HNSW index | mmap (C6 owns the file) | flight lifetime | Same as above |
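A content-hash gate of the kind named in the Invalidation column can be sketched with the standard library. The manifest layout used here (a mapping from artifact file name to expected SHA-256 hex digest) and the function names are assumptions for illustration; D-C10-3 itself is not specified in this document.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large engine/index files are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Dict[str, str], root: Path) -> List[str]:
    """Return names of artifacts that are missing or whose hash differs."""
    mismatched = []
    for name, expected in manifest.items():
        artifact = root / name
        if not artifact.is_file() or sha256_of(artifact) != expected:
            mismatched.append(name)
    return mismatched
```

An empty return value would let takeoff proceed; any entry in the list would invalidate the cached artifact.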

Storage Estimates

C2 itself stores no persistent data; it consumes C6's descriptor index. Sizing belongs in C6.

Data Management

C2 is read-only against C6 during F3/F4/F6. Pre-flight, once C11 TileDownloader has populated C6, F1 triggers C10, which calls embed_query on every staged tile to build the descriptor matrix consumed by C6.

5. Implementation Details

Algorithmic Complexity: HNSW search scales approximately as O(log N) in corpus size N at k=10; the backbone forward pass is O(1) per frame, i.e. independent of corpus size (GPU-bound).

State Management: stateless per-frame; the only persistent state is the loaded backbone weights and the FAISS index pointer (held by C6 and passed in via constructor).

Key Dependencies:

| Library | Version | Purpose |
|---|---|---|
| FAISS (Python + C++) | upstream HEAD pinned per Plan-phase | HNSW retrieval; consumed via C6 |
| TensorRT | 10.3 (JetPack 6.2 pin) | Primary inference backend; consumed via C7 |
| ONNX Runtime + TRT EP | matches C7 | Fallback backend |
| PyTorch | matches simple-baseline track | FP16 baseline (NetVLAD / MixVPR mandatory) |
| UltraVPR (research code drop) | upstream HEAD pinned per Plan-phase | Documentary Lead PRIMARY backbone |
| MegaLoc, MixVPR, SelaVPR, EigenPlaces, NetVLAD | upstream HEAD pinned per Plan-phase | Secondary + mandatory simple-baselines |

Error Handling Strategy:

  • VprBackboneError: backbone forward pass failed (CUDA OOM, TRT engine deserialize mismatch). C2 emits no VprResult; C5 falls back to VIO-only with provenance label visual_propagated (AC-1.4).
  • IndexUnavailableError: FAISS index handle invalid (e.g., post-F8 reboot before warm-up). Same fallback as above; F8 recovery flow re-mmaps the index.
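The "emit no VprResult" contract in both cases above can be sketched as a thin wrapper around the strategy. The function name `run_vpr_step` and the use of `None` to signal "nothing emitted" are assumptions; how C5 actually observes the missing result and applies the visual_propagated label is outside C2 and not shown.

```python
import logging

logger = logging.getLogger("c2.vpr")


class VprBackboneError(RuntimeError):
    """Backbone forward pass failed (e.g. CUDA OOM, engine mismatch)."""


class IndexUnavailableError(RuntimeError):
    """FAISS index handle is invalid or not yet warmed up."""


def run_vpr_step(strategy, frame, calibration, k: int = 10):
    """Return a VprResult, or None when C2 must emit nothing for this frame."""
    try:
        query = strategy.embed_query(frame, calibration)
        return strategy.retrieve_topk(query, k)
    except (VprBackboneError, IndexUnavailableError) as exc:
        # Both error types share the same fallback: no result this frame.
        logger.error("VPR step failed: %s: %s", type(exc).__name__, exc)
        return None
```

Any other exception type is deliberately left to propagate, so an unexpected bug is not silently treated as a recoverable retrieval failure.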

6. Extensions and Helpers

| Helper | Purpose | Used By |
|---|---|---|
| BackbonePreprocessor | resize / crop / normalise per backbone's input contract | C2 only; keep inside the component, not a shared helper |
| DescriptorNormaliser | L2-normalise descriptors so that cosine similarity and Euclidean distance produce the same ranking | C2 (query side), C10 (corpus side at cache artifact build) |
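The reason normalisation makes the two metrics interchangeable is the identity ‖a − b‖² = 2 − 2·cos(a, b) for unit vectors: squared Euclidean distance is a decreasing function of cosine similarity, so both rank candidates identically. A pure-Python sketch (the function names are illustrative, and a real DescriptorNormaliser would presumably operate on arrays):

```python
import math
from typing import List, Sequence


def l2_normalise(v: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit L2 norm; reject near-zero vectors."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < eps:
        raise ValueError("cannot normalise a zero-length descriptor")
    return [x / norm for x in v]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = l2_normalise([3.0, 4.0, 0.0])
tile = l2_normalise([1.0, 2.0, 2.0])
cosine = dot(query, tile)
distance_sq = squared_euclidean(query, tile)
```

Because the identity only holds when both sides are normalised, the query side (C2) and the corpus side (C10) have to apply the same normalisation.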

7. Caveats & Edge Cases

Known limitations:

  • VPR is sensitive to scene change between cache build and flight time — AC-NEW-6 freshness gating is the project-level mitigation, not a C2 concern.
  • Backbone choice is constrained by ADR-002: only the linked-in implementations are selectable at runtime.

Potential race conditions:

  • Concurrent embed_query calls on a single strategy instance can race on the GPU stream. Bind one strategy instance to one ingest thread — composition root enforces.
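One way a composition root might enforce the one-instance-per-thread rule is a guard wrapper that binds to the first calling thread and rejects all others. `ThreadBoundStrategy` is a hypothetical name, and this is only a sketch of the idea, not the project's actual enforcement mechanism.

```python
import threading


class ThreadBoundStrategy:
    """Wrap a strategy and reject calls from any thread but the first caller."""

    def __init__(self, inner):
        self._inner = inner
        self._owner = None
        self._lock = threading.Lock()

    def _check_owner(self) -> None:
        me = threading.get_ident()
        with self._lock:
            if self._owner is None:
                self._owner = me  # bind on first use (the ingest thread)
            elif self._owner != me:
                raise RuntimeError(
                    "strategy instance is bound to another thread; "
                    "create one instance per ingest thread"
                )

    def embed_query(self, frame, calibration):
        self._check_owner()
        return self._inner.embed_query(frame, calibration)

    def retrieve_topk(self, query, k):
        self._check_owner()
        return self._inner.retrieve_topk(query, k)
```

Note that Python may recycle thread identifiers after a thread exits, so this guard is a development-time safety net while the owning thread is alive, not a hard guarantee.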

Performance bottlenecks:

  • Backbone forward pass is the dominant cost (~30–80 ms on Jetson, depending on the backbone). FAISS HNSW search is sub-millisecond for 100k-tile corpora.
  • D-CROSS-LATENCY-1 hybrid does not change C2 behaviour — C2's budget is fixed; the auto-degrade happens at C4.

8. Dependency Graph

Must be implemented after: C6 (descriptor index), C7 (inference runtime), C10 (descriptor population at cache artifact build).

Can be implemented in parallel with: C1, C8 — independent paths.

Blocks: C2.5 (no candidates without VprResult), F3 / F6.

9. Logging Strategy

| Log Level | When | Example |
|---|---|---|
| ERROR | VprBackboneError or IndexUnavailableError | VPR backbone OOM: backbone=ultravpr, frame=12345 |
| WARN | top-1 distance exceeds drift threshold (potential false-positive retrieval) | VPR top-1 distance 0.42 above warn threshold 0.30; backbone=ultravpr |
| INFO | Strategy ready; backbone loaded | VPR ready: backbone=ultravpr, dim=512, corpus_size=87654 |
| DEBUG | Per-frame top-K distances | VPR frame=12345 top10_distances=[0.12, 0.14, ...] |

Log format: structured JSON. Log storage: stdout / journald / FDR via C13 (ERROR + WARN only).
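A structured-JSON formatter and the top-1 distance check behind the WARN entry could be sketched with the standard `logging` module. The JSON field names, the `fields` key passed through `extra`, and the `log_top1_distance` helper are assumptions for illustration; the threshold is a parameter because this document only shows 0.30 as an example value.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        # Python names the level "WARNING"; the table above uses "WARN".
        level = "WARN" if record.levelname == "WARNING" else record.levelname
        payload = {
            "level": level,
            "logger": record.name,
            "message": record.getMessage(),
        }
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def log_top1_distance(logger, distances, backbone: str, warn_threshold: float) -> bool:
    """Emit a WARN record when the best match is suspiciously far away."""
    if not distances:
        return False
    top1 = distances[0]  # distances are ranked ascending, so index 0 is best
    if top1 > warn_threshold:
        logger.warning(
            "VPR top-1 distance above warn threshold",
            extra={"fields": {
                "top1_distance": top1,
                "warn_threshold": warn_threshold,
                "backbone": backbone,
            }},
        )
        return True
    return False
```

Keeping the numeric values as separate JSON fields, instead of interpolating them into the message, makes the records directly filterable downstream.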