Item 2 (C1) + item 3 batch 1 of ~5 (C2 VPR, C2.5 Rerank, C3 Matcher) of the cycle-1 component-description reconciliation called out in ripple_log_cycle1.md.

For each touched description.md:
- Add a "Cycle-1 operational reality" paragraph in section 1 that names the _STRATEGY_REGISTRY + register_airborne_strategies() runtime gate (AZ-591), the pre_constructed dict path through compose_root (AZ-618 umbrella), the per-component AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS row, and any cycle-1 strategy-default vs documented-primary disambiguation (net_vlad as the C2 default; xfeat parked from the C3 airborne registry).
- Relax the OpenCV row in section 5 Key Dependencies to the D-CROSS-CVE-1 cycle-1 pin (>=4.11.0.86,<4.12) wherever the component imports cv2 (C2 preprocessors, C2.5 ORB placeholder, C3 RANSAC + reprojection).
- Add a "Cycle-1 Tier-2 follow-up dependencies" subsection in section 7 only for components with a strategy module that is built but parked from the airborne registry (C3 xfeat).

Refresh ripple_log_cycle1.md follow-up ordering with per-batch progress + extracted batch pattern so the next batch session has a self-contained recipe.

Bump _autodev_state.md sub_step.detail to reflect batch 1 completion (10 components + 8 helpers + tests/ remain).

Co-authored-by: Cursor <cursoragent@cursor.com>
C2 — Visual Place Recognition
1. High-Level Overview
Purpose: given the current NavCameraFrame, retrieve the top-K=10 candidate satellite tiles from the pre-cached corpus by descriptor similarity. C2 owns the retrieval step; C2.5 narrows K=10 → N=3 via inlier-based re-rank.
Architectural Pattern: Strategy — VprStrategy interface; concrete implementations (UltraVPR primary, MegaLoc secondary, MixVPR / SelaVPR / EigenPlaces / NetVLAD / SALAD additional candidates) selected at startup by config (ADR-001); build-time gated per-implementation by BUILD_* flags (ADR-002); composition-root wired (ADR-009).
Cycle-1 operational reality: the airborne binary wires C2 through the _STRATEGY_REGISTRY + register_airborne_strategies() runtime gate (AZ-591) on top of the build-flag matrix, and constructor injection flows through the pre_constructed dict passed to compose_root(config, pre_constructed=...) (AZ-618 umbrella → AZ-620 c6 storage phase + AZ-623 c7 inference phase). All seven backbones (ultra_vpr, net_vlad, mega_loc, mix_vpr, sela_vpr, eigen_places, salad) have wired strategy modules + _preprocessor_* siblings + _faiss_bridge; their BUILD_VPR_<variant> env flags default OFF (tests/CI must opt in per strategy — see runtime_root/vpr_factory.py::_is_build_flag_on). The cycle-1 C2VprConfig.strategy default is net_vlad (the mandatory simple-baseline per Plan-phase D-C2-1) — ultra_vpr remains the Documentary Lead's PRIMARY backbone but additionally requires a pre-compiled .trt engine produced by C10's engine compiler (AZ-321). The c2_vpr slot lists ("c6_descriptor_index", "c7_inference") in AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS; missing keys raise AirborneBootstrapError at composition time, not at first frame.
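The gating described above can be illustrated with a minimal sketch. It is not the project's compose_root or vpr_factory code: the function names `is_build_flag_on`, `register_strategies` and `resolve_c2`, the upper-cased flag spelling, and the accepted truthy values are all assumptions made for illustration. Only the key tuple for c2_vpr and the error name come from this description.

```python
import os


class AirborneBootstrapError(RuntimeError):
    """Stand-in for the error raised when composition-time wiring is incomplete."""


# Mirrors the c2_vpr row described above; other slots are omitted.
AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS = {
    "c2_vpr": ("c6_descriptor_index", "c7_inference"),
}


def is_build_flag_on(variant, env=None):
    """ASSUMPTION: the flag is BUILD_VPR_<VARIANT> upper-cased, ON only for 1/on/true."""
    env = os.environ if env is None else env
    value = env.get("BUILD_VPR_" + variant.upper(), "")
    return value.strip().lower() in {"1", "on", "true"}


def register_strategies(variants, env=None):
    """Return a registry holding only the variants whose build flag is ON."""
    registry = {}
    for variant in variants:
        if is_build_flag_on(variant, env):
            # A real registry would map to a factory; the name is a placeholder.
            registry[("c2_vpr", variant)] = variant
    return registry


def resolve_c2(strategy, registry, pre_constructed):
    """Fail at composition time if the strategy or its injected inputs are missing."""
    if ("c2_vpr", strategy) not in registry:
        raise AirborneBootstrapError(f"c2_vpr strategy {strategy!r} is not registered")
    required = AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS["c2_vpr"]
    missing = [key for key in required if key not in pre_constructed]
    if missing:
        raise AirborneBootstrapError(f"c2_vpr is missing pre_constructed keys: {missing}")
    return {key: pre_constructed[key] for key in required}
```

Because both checks run before any frame is processed, a test that forgets to opt in to a strategy flag or to supply one of the two injected objects fails during bootstrap rather than on the first frame.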
Upstream dependencies:
- Camera ingest thread → NavCameraFrame (parallel fan-out with C1; same frame, distinct queue depth).
- C7 InferenceRuntime → backbone forward pass (TRT/ONNX/PyTorch per active runtime).
- C6 DescriptorIndex → FAISS HNSW lookup over pre-cached tile descriptors.
- Camera calibration artifact — for backbone input preprocessing (resize/crop/normalise).
Downstream consumers:
- C2.5 ReRanker (consumes VprResult).
2. Internal Interfaces
Interface: VprStrategy
| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| embed_query | NavCameraFrame, CameraCalibration | VprQuery | No | VprBackboneError |
| retrieve_topk | VprQuery, k: int | VprResult | No | IndexUnavailableError, VprBackboneError |
| descriptor_dim | () | int | No | None |
Input DTOs:
NavCameraFrame: see C1 spec — same DTO
VprQuery:
frame_id: uuid (required)
embedding: ndarray[D, dtype=float16|float32] (required) — D depends on backbone
produced_at: monotonic_ns
Output DTOs:
VprResult:
frame_id: uuid
candidates: list[VprCandidate] (length = k, ranked by descriptor distance ascending)
retrieved_at: monotonic_ns
backbone_label: string — for FDR provenance
VprCandidate:
tile_id: composite (zoomLevel, lat, lon)
descriptor_distance: float — backbone-specific metric (cosine for L2-normalised embeddings)
descriptor_dim: int
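As a hedged illustration of the interface and DTOs above, the sketch below expresses them as Python dataclasses and a typing.Protocol. The field names follow this section, but the concrete Python types are assumptions (the spec's ndarray embedding is shown as a plain sequence). `ToyStrategy` is a hypothetical in-memory stand-in that searches exhaustively and reads a precomputed descriptor from the frame; it is not one of the listed backbones and does not use FAISS.

```python
import math
import time
import uuid
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

TileId = Tuple[int, float, float]  # (zoomLevel, lat, lon)


@dataclass(frozen=True)
class VprQuery:
    frame_id: uuid.UUID
    embedding: Sequence[float]  # the spec uses ndarray[D]; a plain sequence here
    produced_at: int  # monotonic_ns


@dataclass(frozen=True)
class VprCandidate:
    tile_id: TileId
    descriptor_distance: float
    descriptor_dim: int


@dataclass(frozen=True)
class VprResult:
    frame_id: uuid.UUID
    candidates: List[VprCandidate]  # ranked by descriptor distance ascending
    retrieved_at: int  # monotonic_ns
    backbone_label: str


class VprStrategy(Protocol):
    def embed_query(self, frame, calibration) -> VprQuery: ...
    def retrieve_topk(self, query: VprQuery, k: int) -> VprResult: ...
    def descriptor_dim(self) -> int: ...


class ToyStrategy:
    """Hypothetical strategy: exhaustive cosine search over an in-memory corpus."""

    def __init__(self, corpus):
        self._corpus = corpus  # dict: tile_id -> unit-length descriptor
        self._dim = len(next(iter(corpus.values())))

    def descriptor_dim(self) -> int:
        return self._dim

    def embed_query(self, frame, calibration) -> VprQuery:
        # ASSUMPTION: `frame` is a dict with a precomputed "descriptor"; no backbone runs.
        vec = frame["descriptor"]
        norm = math.sqrt(sum(x * x for x in vec))
        return VprQuery(frame["frame_id"], [x / norm for x in vec], time.monotonic_ns())

    def retrieve_topk(self, query: VprQuery, k: int) -> VprResult:
        scored = [
            VprCandidate(tile, 1.0 - sum(a * b for a, b in zip(query.embedding, desc)), self._dim)
            for tile, desc in self._corpus.items()
        ]
        scored.sort(key=lambda c: c.descriptor_distance)  # cosine distance, ascending
        return VprResult(query.frame_id, scored[:k], time.monotonic_ns(), "toy")
```

A structural Protocol keeps the sketch close to the Strategy pattern named in section 1: any class with these three methods satisfies it without inheriting from a base class.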
3. External API Specification
Not applicable — internal-only component.
4. Data Access Patterns
Queries
| Query | Frequency | Hot Path | Index Needed |
|---|---|---|---|
| FAISS HNSW top-K=10 search | 3 Hz (per nav frame) | Yes | Yes — pre-built HNSW (C6) |
Caching Strategy
| Data | Cache Type | TTL | Invalidation |
|---|---|---|---|
| Backbone weights | TRT engine on disk + GPU resident | flight lifetime | Manifest content-hash gate (D-C10-3) at takeoff |
| FAISS HNSW index | mmap (C6 owns the file) | flight lifetime | Same as above |
Storage Estimates
C2 itself stores no persistent data; it consumes C6's descriptor index. Sizing belongs in C6.
Data Management
C2 is read-only against C6 during F3/F4/F6. Pre-flight, F1 triggers C10 (after C11 TileDownloader has populated C6) to call embed_query on every staged tile to populate the descriptor matrix consumed by C6.
5. Implementation Details
Algorithmic Complexity: HNSW search scales roughly as O(log N) in corpus size N at fixed k=10; the backbone forward pass is O(1) per frame (GPU-bound).
State Management: stateless per-frame; the only persistent state is the loaded backbone weights and the FAISS index pointer (held by C6 and passed in via constructor).
Key Dependencies:
| Library | Version | Purpose |
|---|---|---|
| FAISS (Python + C++) | upstream HEAD pinned per Plan-phase | HNSW retrieval; consumed via C6 |
| TensorRT | 10.3 (JetPack 6.2 pin) | Primary inference backend; consumed via C7 |
| ONNX Runtime + TRT EP | matches C7 | Fallback backend |
| PyTorch | matches simple-baseline track | FP16 baseline (NetVLAD / MixVPR mandatory) |
| UltraVPR (research code drop) | upstream HEAD pinned per Plan-phase | Documentary Lead PRIMARY backbone |
| MegaLoc, MixVPR, SelaVPR, EigenPlaces, NetVLAD | upstream HEAD pinned per Plan-phase | Secondary + mandatory simple-baselines |
| OpenCV (cv2) | >=4.11.0.86,<4.12 (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred; see _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md) | Image decode + colour-space conversions in the per-strategy _preprocessor_*.py modules |
Error Handling Strategy:
- VprBackboneError: backbone forward pass failed (CUDA OOM, TRT engine deserialize mismatch). C2 emits no VprResult; C5 falls back to VIO-only with provenance label visual_propagated (AC-1.4).
- IndexUnavailableError: FAISS index handle invalid (e.g., post-F8 reboot before warm-up). Same fallback as above; F8 recovery flow re-mmaps the index.
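The fallback described above can be sketched as follows. This is a minimal illustration, not the project's code: `run_vpr_step` and `fallback_provenance` are hypothetical helper names, and the exception classes are local stand-ins for the error types named in section 2.

```python
class VprBackboneError(RuntimeError):
    """Stand-in: backbone forward pass failed."""


class IndexUnavailableError(RuntimeError):
    """Stand-in: FAISS index handle is invalid."""


def run_vpr_step(strategy, frame, calibration, k=10, on_error=None):
    """Return a VprResult, or None when either documented error occurs."""
    try:
        query = strategy.embed_query(frame, calibration)
        return strategy.retrieve_topk(query, k)
    except (VprBackboneError, IndexUnavailableError) as exc:
        if on_error is not None:
            on_error(exc)  # e.g. emit the ERROR log line from section 9
        return None  # C2 emits no VprResult for this frame


def fallback_provenance(vpr_result):
    """Return the fallback label when no VprResult exists, else None (normal path decides)."""
    return "visual_propagated" if vpr_result is None else None
```

Both error types are caught in one place because the text gives them the same downstream effect; any other exception is deliberately left to propagate.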
6. Extensions and Helpers
| Helper | Purpose | Used By |
|---|---|---|
| BackbonePreprocessor | resize / crop / normalise per backbone's input contract | C2 only; keep inside the component, not a shared helper |
| DescriptorNormaliser | L2-normalise descriptors so that ranking by cosine similarity matches ranking by Euclidean distance | C2 (query side), C10 (corpus side at cache artifact build) |
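Why L2 normalisation matters for DescriptorNormaliser: for unit-length vectors a and b, the squared Euclidean distance satisfies ||a - b||^2 = 2 - 2·cos(a, b), so the two metrics produce the same ranking. The sketch below checks that identity numerically with the standard library only; the function names are illustrative and not taken from the component.

```python
import math


def l2_normalise(vec):
    """Scale a vector to unit length; a zero vector has no direction to keep."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_cosine(query, corpus):
    """Indices of corpus vectors, most similar first."""
    return sorted(range(len(corpus)), key=lambda i: -dot(query, corpus[i]))


def rank_by_euclidean(query, corpus):
    """Indices of corpus vectors, nearest first."""
    return sorted(range(len(corpus)), key=lambda i: squared_euclidean(query, corpus[i]))
```

With normalised inputs, `rank_by_cosine` and `rank_by_euclidean` return the same order, which is why the query side (C2) and the corpus side (C10) need to apply the same normalisation.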
7. Caveats & Edge Cases
Known limitations:
- VPR is sensitive to scene change between cache build and flight time — AC-NEW-6 freshness gating is the project-level mitigation, not a C2 concern.
- Backbone choice is constrained by ADR-002: only the linked-in implementations are selectable at runtime.
Potential race conditions:
- Concurrent embed_query calls on a single strategy instance can race on the GPU stream. Bind one strategy instance to one ingest thread; the composition root enforces this.
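One way to make the one-instance-per-thread rule fail loudly is a thin wrapper that records the first calling thread and rejects any other. This is a sketch under assumptions: the text says only that the composition root enforces the binding, not how, and `ThreadBoundStrategy` is a hypothetical name.

```python
import threading


class ThreadBoundStrategy:
    """Wraps a strategy and rejects embed_query calls from a second thread."""

    def __init__(self, strategy):
        self._strategy = strategy
        self._owner = None  # thread ident of the first caller
        self._lock = threading.Lock()

    def _check_owner(self):
        ident = threading.get_ident()
        with self._lock:
            if self._owner is None:
                self._owner = ident  # bind to the first thread that calls in
            elif self._owner != ident:
                raise RuntimeError("strategy instance is bound to another thread")

    def embed_query(self, frame, calibration):
        self._check_owner()
        return self._strategy.embed_query(frame, calibration)
```

The guard does not serialise access; it turns an accidental second caller into an immediate error instead of a silent race on the GPU stream.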
Performance bottlenecks:
- Backbone forward pass is the dominant cost (~30–80 ms on Jetson per backbone). FAISS HNSW search is sub-millisecond for 100k-tile corpora.
- D-CROSS-LATENCY-1 hybrid does not change C2 behaviour — C2's budget is fixed; the auto-degrade happens at C4.
8. Dependency Graph
Must be implemented after: C6 (descriptor index), C7 (inference runtime), C10 (descriptor population at cache artifact build).
Can be implemented in parallel with: C1, C8 — independent paths.
Blocks: C2.5 (no candidates without VprResult), F3 / F6.
9. Logging Strategy
| Log Level | When | Example |
|---|---|---|
| ERROR | VprBackboneError or IndexUnavailableError | VPR backbone OOM: backbone=ultravpr, frame=12345 |
| WARN | top-1 distance exceeds drift threshold (potential false-positive retrieval) | VPR top-1 distance 0.42 above warn threshold 0.30; backbone=ultravpr |
| INFO | Strategy ready; backbone loaded | VPR ready: backbone=ultravpr, dim=512, corpus_size=87654 |
| DEBUG | Per-frame top-K distances | VPR frame=12345 top10_distances=[0.12, 0.14, ...] |
Log format: structured JSON. Log storage: stdout / journald / FDR via C13 (ERROR + WARN only).
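A minimal sketch of structured JSON logging with Python's standard logging module, assuming a hypothetical field layout (level, component, message, plus free-form fields). The real schema, the C13 forwarding mechanism and the configured warn threshold are not specified here; the 0.30 default is taken from the WARN example in the table above.

```python
import json
import logging

# Map Python level numbers to the level names used in the table above.
_LEVEL_NAMES = {
    logging.ERROR: "ERROR",
    logging.WARNING: "WARN",
    logging.INFO: "INFO",
    logging.DEBUG: "DEBUG",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": _LEVEL_NAMES.get(record.levelno, record.levelname),
            "component": "c2_vpr",  # ASSUMPTION: illustrative field, not a documented schema
            "message": record.getMessage(),
        }
        # Structured fields are passed via logging's `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def goes_to_fdr(record):
    """Only WARN and above are forwarded to the FDR, per the storage note."""
    return record.levelno >= logging.WARNING


def top1_exceeds_threshold(distances, warn_threshold=0.30):
    """True when the best (first) distance is above the warn threshold."""
    return bool(distances) and distances[0] > warn_threshold
```

A handler configured with `JsonFormatter` can write to stdout or journald unchanged, while `goes_to_fdr` could serve as a filter on a second handler.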