# Faiss Index Manager Helper

## Interface Definition

**Interface Name**: `IFaissIndexManager`

### Interface Methods

```python
from abc import ABC, abstractmethod
from typing import Any, Tuple

import numpy as np

# Placeholder alias (assumption): the concrete index type is not defined in this document.
FaissIndex = Any


class IFaissIndexManager(ABC):
    @abstractmethod
    def build_index(self, descriptors: np.ndarray, index_type: str) -> FaissIndex:
        """Build an index of the given type from (N, D) descriptors."""
        pass

    @abstractmethod
    def add_descriptors(self, index: FaissIndex, descriptors: np.ndarray) -> bool:
        """Add more descriptors to an existing index."""
        pass

    @abstractmethod
    def search(self, index: FaissIndex, query: np.ndarray, k: int) -> Tuple[np.ndarray, np.ndarray]:
        """Return (distances, indices) of the k nearest neighbors."""
        pass

    @abstractmethod
    def save_index(self, index: FaissIndex, path: str) -> bool:
        """Save the index to disk."""
        pass

    @abstractmethod
    def load_index(self, path: str) -> FaissIndex:
        """Load a pre-built index from disk."""
        pass

    @abstractmethod
    def is_gpu_available(self) -> bool:
        """Return True if GPU-backed Faiss can be used."""
        pass

    @abstractmethod
    def set_device(self, device: str) -> bool:
        """Set device: 'gpu' or 'cpu'."""
        pass
```

## Component Description

Manages Faiss indexes for DINOv2 descriptor similarity search. H04 provides generic Faiss index operations that serve two kinds of index:

### Satellite Index (Primary Use Case)

- **Index format**: IVF1000 (Inverted File with 1000 clusters)
- **Index source**: Pre-built by external Satellite Provider (Maxar, Google Maps, Copernicus, etc.)
- **Index delivery**: Provider delivers index file + tile metadata when tiles are fetched on demand
- **Index updates**: Provider rebuilds index when new satellite tiles become available
- **Usage**: F08 Global Place Recognition loads this index via H04.load_index()

### UAV Index (Optional, Future Use)

For loop closure and chunk-to-chunk matching:

1. **Loop closure detection**: Find when UAV revisits previously seen areas
2. **Chunk-to-chunk matching**: Match disconnected chunks to each other
3. **Flight-to-flight matching**: Match current flight to previous flights

**Note**: H04 is a low-level utility that manages ANY Faiss index. It does NOT know whether the index contains satellite or UAV descriptors.

## API Methods

### `build_index(descriptors: np.ndarray, index_type: str) -> FaissIndex`

**Description**: Builds a Faiss index from descriptors.

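One way `build_index()` could turn its `index_type` string into a concrete index is through Faiss's `index_factory`. The sketch below is an assumption, not H04's actual implementation: the `FACTORY_STRINGS` table, the `factory_string` helper, and the HNSW link count of 32 are hypothetical. Only the 1000-cluster IVF setting is taken from the satellite index format described earlier.

```python
# Hypothetical mapping from this document's index_type names to Faiss
# index_factory strings. The chosen strings are assumptions, not H04's spec.
FACTORY_STRINGS = {
    "IVF": "IVF1000,Flat",  # 1000 inverted lists, mirroring the IVF1000 format
    "HNSW": "HNSW32",       # graph index; 32 links per node is an assumed value
    "Flat": "Flat",         # exact brute-force search
}


def factory_string(index_type: str) -> str:
    """Return the index_factory description for a supported index_type."""
    try:
        return FACTORY_STRINGS[index_type]
    except KeyError:
        supported = ", ".join(sorted(FACTORY_STRINGS))
        raise ValueError(
            f"Unsupported index_type {index_type!r}; expected one of: {supported}"
        ) from None


def build_index(descriptors, index_type: str):
    """Build and populate an index from a float32 (N, D) array (needs faiss)."""
    import faiss  # imported lazily so the mapping above works without faiss

    index = faiss.index_factory(descriptors.shape[1], factory_string(index_type))
    if not index.is_trained:  # IVF indexes must be trained before add()
        index.train(descriptors)
    index.add(descriptors)
    return index
```

Keeping the string mapping separate from the Faiss call means an unsupported `index_type` can be rejected without touching Faiss at all.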
**Index Types**:

- **"IVF"**: Inverted File (fast for large databases)
- **"HNSW"**: Hierarchical Navigable Small World (strong accuracy/speed trade-off)
- **"Flat"**: Brute force (exact, slow for large datasets)

**Input**: (N, D) descriptors array

---

### `add_descriptors(index: FaissIndex, descriptors: np.ndarray) -> bool`

**Description**: Adds more descriptors to an existing index.

---

### `search(index: FaissIndex, query: np.ndarray, k: int) -> Tuple[np.ndarray, np.ndarray]`

**Description**: Searches for the k nearest neighbors of a query descriptor.

**Output**: (distances, indices), each of shape (k,) for a single query descriptor

---

### `save_index(index: FaissIndex, path: str) -> bool`

**Description**: Saves the index to disk so it can be reloaded quickly at startup.

---

### `load_index(path: str) -> FaissIndex`

**Description**: Loads a pre-built index from disk.

## Dependencies

**External**: faiss-gpu or faiss-cpu

## GPU/CPU Fallback

H04 supports automatic fallback from GPU to CPU:

- `is_gpu_available()`: Returns True if faiss-gpu is available and CUDA works
- `set_device("gpu")`: Use GPU acceleration (faster for large indexes)
- `set_device("cpu")`: Use CPU (fallback when GPU unavailable)

## Current vs Future Use Cases

### Current Use (MVP)

- **Satellite Index Loading**: F08 uses `load_index()` to load the pre-built satellite descriptor index from the provider.
- **Similarity Search**: F08 uses `search()` to find candidate satellite tiles.

### Future Use Cases (build_index, add_descriptors)

The `build_index()` and `add_descriptors()` methods are reserved for future features:

1. **UAV Loop Closure Detection**: Build an index of UAV frame descriptors to detect when the UAV revisits previously seen areas.
2. **Chunk-to-Chunk Matching**: Build an index of chunk descriptors for matching disconnected trajectory segments.
3. **Flight-to-Flight Matching**: Match the current flight to previous flights for multi-flight consistency.

**Note**: For MVP, F08 does NOT build satellite indexes; they are provided pre-built by the satellite data provider.

## Test Cases

1. Build index with 10,000 UAV image descriptors → succeeds
2. Search query UAV descriptor → returns top-k similar UAV frames
3. Save/load index → index restored correctly
4. GPU unavailable → automatically falls back to CPU
5. Add descriptors incrementally → index grows correctly
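
The GPU-to-CPU fallback exercised by test case 4 could be approached along the following lines. This is a minimal sketch under stated assumptions, not H04's implementation: `resolve_device` and its `gpu_available` override are hypothetical names introduced here so the fallback can be tested without a GPU, and the check relies on `faiss.get_num_gpus()`, which GPU builds of Faiss expose.

```python
from typing import Optional


def is_gpu_available() -> bool:
    """Return True only if faiss imports, exposes GPU support, and sees a GPU."""
    try:
        import faiss
    except ImportError:
        return False  # neither faiss-gpu nor faiss-cpu is installed
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    if get_num_gpus is None:
        return False  # build without GPU support
    try:
        return get_num_gpus() > 0
    except Exception:
        return False  # e.g. CUDA driver problems; treat as "no GPU"


def resolve_device(requested: str, gpu_available: Optional[bool] = None) -> str:
    """Return the device to use, falling back to 'cpu' when no GPU is usable.

    `gpu_available` is a hypothetical test hook; when omitted, the real
    availability check is performed.
    """
    if requested not in ("gpu", "cpu"):
        raise ValueError(f"device must be 'gpu' or 'cpu', got {requested!r}")
    if gpu_available is None:
        gpu_available = is_gpu_available()
    if requested == "gpu" and not gpu_available:
        return "cpu"  # automatic fallback
    return requested
```

A `set_device()` implementation might call `resolve_device()` and report whether the requested device was actually granted. Moving an index onto the GPU is a separate step; GPU builds of Faiss provide `faiss.index_cpu_to_gpu` for that purpose.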