gps-denied-onboard/docs/02_components/helpers/h04_faiss_index_manager_spec.md
2025-11-30 16:09:31 +02:00

Faiss Index Manager Helper

Interface Definition

Interface Name: IFaissIndexManager

Interface Methods

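from abc import ABC, abstractmethod
from typing import Tuple

import numpy as np

# FaissIndex is not defined in this spec; it is assumed to be a handle type
# wrapping a Faiss index.

class IFaissIndexManager(ABC):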
    @abstractmethod
    def build_index(self, descriptors: np.ndarray, index_type: str) -> FaissIndex:
        pass
    
    @abstractmethod
    def add_descriptors(self, index: FaissIndex, descriptors: np.ndarray) -> bool:
        pass
    
    @abstractmethod
    def search(self, index: FaissIndex, query: np.ndarray, k: int) -> Tuple[np.ndarray, np.ndarray]:
        pass
    
    @abstractmethod
    def save_index(self, index: FaissIndex, path: str) -> bool:
        pass
    
    @abstractmethod
    def load_index(self, path: str) -> FaissIndex:
        pass
    
    @abstractmethod
    def is_gpu_available(self) -> bool:
        pass
    
    @abstractmethod
    def set_device(self, device: str) -> bool:
        """Set device: 'gpu' or 'cpu'."""
        pass

Component Description

Manages Faiss indices for DINOv2 descriptor similarity search. H04 provides generic Faiss index operations for two kinds of index:

Satellite Index (Primary Use Case)

  • Index format: IVF1000 (Inverted File with 1000 clusters)
  • Index source: Pre-built by external Satellite Provider (Maxar, Google Maps, Copernicus, etc.)
  • Index delivery: Provider delivers index file + tile metadata when tiles are fetched on demand
  • Index updates: Provider rebuilds index when new satellite tiles become available
  • Usage: F08 Global Place Recognition loads this index via H04.load_index()

UAV Index (Optional, Future Use)

For loop closure, chunk-to-chunk matching, and flight-to-flight matching:

  1. Loop closure detection: Find when UAV revisits previously seen areas
  2. Chunk-to-chunk matching: Match disconnected chunks to each other
  3. Flight-to-flight matching: Match current flight to previous flights

Note: H04 is a low-level utility that manages ANY Faiss index. It does NOT know whether the index contains satellite or UAV descriptors.

API Methods

build_index(descriptors: np.ndarray, index_type: str) -> FaissIndex

Description: Builds Faiss index from descriptors.

Index Types:

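  • "IVF": Inverted File (approximate; must be trained on representative descriptors; fast for large databases)
  • "HNSW": Hierarchical Navigable Small World graph (approximate; high recall at low query latency, at the cost of higher memory use)
  • "Flat": Brute force (exact; query time grows linearly with database size)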

Input: descriptors array of shape (N, D); Faiss expects float32 data
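
A minimal sketch of how build_index could map the three index types onto Faiss factory strings. The FACTORY_STRINGS table, the factory_string helper, the nlist parameter, and the HNSW32 setting (32 links per node) are illustrative assumptions, not part of this spec; faiss.index_factory, train, and add are standard Faiss calls. The sketch returns the raw Faiss index rather than the spec's FaissIndex type.

```python
import numpy as np

# Hypothetical mapping from the spec's index_type names to Faiss factory strings.
FACTORY_STRINGS = {
    "Flat": "Flat",
    "IVF": "IVF{nlist},Flat",
    "HNSW": "HNSW32",
}


def factory_string(index_type: str, nlist: int = 1000) -> str:
    """Translate an index_type name into a Faiss factory string."""
    try:
        return FACTORY_STRINGS[index_type].format(nlist=nlist)
    except KeyError:
        raise ValueError(f"unknown index_type: {index_type!r}") from None


def build_index(descriptors: np.ndarray, index_type: str, nlist: int = 1000):
    """Build, train (if required), and populate a Faiss index."""
    import faiss  # provided by either faiss-cpu or faiss-gpu

    # Faiss requires C-contiguous float32 input of shape (N, D).
    x = np.ascontiguousarray(descriptors, dtype=np.float32)
    if x.ndim != 2:
        raise ValueError("descriptors must have shape (N, D)")

    index = faiss.index_factory(x.shape[1], factory_string(index_type, nlist))
    if not index.is_trained:  # True for IVF until train() has been called
        index.train(x)
    index.add(x)
    return index
```

With the default nlist, factory_string("IVF") yields "IVF1000,Flat", which matches the IVF1000 format described for the satellite index.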


add_descriptors(index: FaissIndex, descriptors: np.ndarray) -> bool

Description: Adds more descriptors to an existing index.
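
One possible shape for add_descriptors, sketched against a raw Faiss index. The attributes d, is_trained, and ntotal and the add method are part of the Faiss Index API; the policy of returning False on a dimension mismatch or an untrained index is an assumption, since this spec does not define failure behaviour.

```python
import numpy as np


def add_descriptors(index, descriptors: np.ndarray) -> bool:
    """Append descriptors to an index; return False if they cannot be added."""
    x = np.ascontiguousarray(descriptors, dtype=np.float32)

    # Reject input whose dimensionality does not match the index.
    if x.ndim != 2 or x.shape[1] != index.d:
        return False
    # IVF-style indices must be trained before vectors can be added.
    if not index.is_trained:
        return False

    index.add(x)  # index.ntotal grows by x.shape[0]
    return True
```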


search(index: FaissIndex, query: np.ndarray, k: int) -> Tuple[np.ndarray, np.ndarray]

Description: Searches for k nearest neighbors.

Output: (distances, indices), each of shape (k,) for a single query descriptor
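
To make the output contract concrete, here is a NumPy-only reference that mimics what an exact "Flat" L2 search returns for one query. The function name flat_search is hypothetical. Two Faiss behaviours are worth keeping in mind when comparing: L2 indices report squared distances, and Faiss itself takes a 2-D query of shape (nq, D) and returns arrays of shape (nq, k), so a wrapper with the signature above would presumably reshape a single query before and after the call.

```python
import numpy as np


def flat_search(database: np.ndarray, query: np.ndarray, k: int):
    """Exact k-nearest-neighbour search by squared L2 distance."""
    db = np.asarray(database, dtype=np.float32)
    q = np.asarray(query, dtype=np.float32).reshape(-1)

    # Squared L2 distance from the query to every database row.
    d2 = ((db - q) ** 2).sum(axis=1)

    # Indices of the k smallest distances, nearest first.
    order = np.argsort(d2, kind="stable")[:k]
    return d2[order], order.astype(np.int64)
```

A reference like this can also serve as ground truth when checking the recall of an approximate IVF or HNSW index on a small sample.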


save_index(index: FaissIndex, path: str) -> bool

Description: Saves index to disk for fast startup.


load_index(path: str) -> FaissIndex

Description: Loads pre-built index from disk.
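
A minimal sketch of the save/load pair using faiss.write_index and faiss.read_index. The error-handling policy (returning False on a failed write, raising FileNotFoundError on a missing file) is an assumption; this spec does not say how failures are reported. Both functions operate on raw Faiss indices rather than the spec's FaissIndex type.

```python
import os


def save_index(index, path: str) -> bool:
    """Serialise a CPU Faiss index to disk; return False if writing fails."""
    import faiss  # provided by either faiss-cpu or faiss-gpu

    # Note: a GPU index must first be copied back with faiss.index_gpu_to_cpu.
    try:
        faiss.write_index(index, path)
    except RuntimeError:  # Faiss C++ errors surface as RuntimeError in Python
        return False
    return True


def load_index(path: str):
    """Load a pre-built Faiss index from disk."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)

    import faiss

    return faiss.read_index(path)
```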

Dependencies

External: faiss-gpu or faiss-cpu, numpy

GPU/CPU Fallback

H04 supports automatic fallback from GPU to CPU:

  • is_gpu_available(): Returns True if faiss-gpu is available and CUDA works
  • set_device("gpu"): Use GPU acceleration (faster for large indices)
  • set_device("cpu"): Use CPU (fallback when GPU unavailable)
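
The fallback logic could look roughly like the sketch below. The resolve_device helper is hypothetical and only illustrates the decision; faiss.get_num_gpus is looked up defensively because the sketch assumes CPU-only builds may not expose it.

```python
def is_gpu_available() -> bool:
    """Return True only if Faiss is importable and reports at least one GPU."""
    try:
        import faiss
    except ImportError:
        return False

    # Looked up with getattr in case a CPU-only build lacks this function.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    if get_num_gpus is None:
        return False
    try:
        return get_num_gpus() > 0
    except Exception:  # e.g. a broken CUDA runtime
        return False


def resolve_device(requested: str, gpu_ok: bool) -> str:
    """Pick the device actually used, falling back to CPU when needed."""
    if requested not in ("gpu", "cpu"):
        raise ValueError(f"device must be 'gpu' or 'cpu', got {requested!r}")
    return "gpu" if requested == "gpu" and gpu_ok else "cpu"
```

Keeping the decision in a pure function makes the "GPU unavailable" path testable on machines without CUDA: resolve_device("gpu", False) returns "cpu".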

Current vs Future Use Cases

Current Use (MVP)

  • Satellite Index Loading: F08 uses load_index() to load pre-built satellite descriptor index from provider.
  • Similarity Search: F08 uses search() to find candidate satellite tiles.
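
As an illustration of the search step, the sketch below turns a (distances, indices) result into candidate tiles. The tile_ids list is a hypothetical stand-in for the provider's tile metadata, whose format this spec does not define. Skipping negative indices reflects Faiss behaviour: when fewer than k neighbours are found, the remaining slots are filled with -1.

```python
from typing import List, Sequence, Tuple


def candidates_from_search(
    distances: Sequence[float],
    indices: Sequence[int],
    tile_ids: Sequence[str],
) -> List[Tuple[str, float]]:
    """Pair each valid search hit with its tile identifier, nearest first."""
    candidates = []
    for dist, idx in zip(distances, indices):
        if idx < 0:  # Faiss pads missing neighbours with -1
            continue
        candidates.append((tile_ids[int(idx)], float(dist)))
    return candidates
```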

Future Use Cases (build_index, add_descriptors)

The build_index() and add_descriptors() methods are reserved for future features:

  1. UAV Loop Closure Detection: Build index of UAV frame descriptors to detect when UAV revisits previously seen areas.
  2. Chunk-to-Chunk Matching: Build index of chunk descriptors for matching disconnected trajectory segments.
  3. Flight-to-Flight Matching: Match current flight to previous flights for multi-flight consistency.

Note: For MVP, F08 does NOT build satellite indices; they are provided pre-built by the satellite data provider.

Test Cases

  1. Build index with 10,000 UAV image descriptors → succeeds
  2. Search query UAV descriptor → returns top-k similar UAV frames
  3. Save/load index → index restored correctly
  4. GPU unavailable → automatically falls back to CPU
  5. Add descriptors incrementally → index grows correctly