mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-04-23 02:46:36 +00:00
initial structure implemented
docs -> _docs
@@ -0,0 +1,64 @@
|
||||
# Feature: Index Management
|
||||
|
||||
## Description
|
||||
Loads and manages the pre-built satellite descriptor database (Faiss index). The satellite data provider builds the semantic index offline using DINOv2 + VLAD; F08 only loads and validates it. This index is the foundation for all place recognition queries.
|
||||
|
||||
## Component APIs Implemented
|
||||
- `load_index(flight_id: str, index_path: str) -> bool`
|
||||
|
||||
## External Tools and Services
|
||||
- **H04 Faiss Index Manager**: For `load_index()`, `validate_index()` operations
|
||||
- **Faiss**: Facebook similarity search library
|
||||
|
||||
## Internal Methods
|
||||
|
||||
### `_validate_index_integrity(index) -> bool`
|
||||
Validates loaded Faiss index: checks descriptor dimensions (4096 or 8192), verifies tile count matches metadata.
|
||||
|
||||
### `_load_tile_metadata(metadata_path: str) -> Dict[int, TileMetadata]`
|
||||
Loads tile_id → gps_center, bounds mapping from JSON file provided by satellite provider.
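
A minimal sketch of how such a loader might look. The JSON layout is an assumption (the provider's actual schema is not specified here): each key is the tile's integer index as a string, and each value holds `gps_center` and `bounds`. The `TileMetadata` class below is a simplified stand-in, and `parse_tile_metadata` is a hypothetical helper.

```python
import json
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TileMetadata:
    """Simplified stand-in for the real TileMetadata model (assumption)."""
    gps_center: Tuple[float, float]            # (lat, lon)
    bounds: Tuple[float, float, float, float]  # (south, west, north, east)


def parse_tile_metadata(json_text: str) -> Dict[int, TileMetadata]:
    """Parse the assumed JSON layout into a tile_id -> TileMetadata mapping."""
    raw = json.loads(json_text)
    if not isinstance(raw, dict):
        raise ValueError("metadata root must be a JSON object")
    metadata = {}
    for key, entry in raw.items():
        metadata[int(key)] = TileMetadata(
            gps_center=tuple(entry["gps_center"]),
            bounds=tuple(entry["bounds"]),
        )
    return metadata


def load_tile_metadata(metadata_path: str) -> Dict[int, TileMetadata]:
    """Read the metadata file from disk and parse it."""
    with open(metadata_path, "r", encoding="utf-8") as fh:
        return parse_tile_metadata(fh.read())


# Illustrative input only; the coordinates are made up.
sample = '{"0": {"gps_center": [48.5, 35.0], "bounds": [48.4, 34.9, 48.6, 35.1]}}'
tiles = parse_tile_metadata(sample)
```

Keeping the parsing separate from the file read makes the "Empty Metadata File" case easy to test without touching disk.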
|
||||
|
||||
### `_verify_metadata_alignment(index, metadata: Dict) -> bool`
|
||||
Ensures metadata entries match index size (same number of tiles).
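
The check could be sketched as below. The index size would come from the loaded index (Faiss exposes it as `index.ntotal`). Requiring contiguous ids `0..index_size-1` is an assumption of this sketch, not something the spec states.

```python
from typing import Dict


class MetadataMismatchError(Exception):
    """Raised when tile metadata does not line up with the index."""


def verify_metadata_alignment(index_size: int, metadata: Dict[int, object]) -> bool:
    """Check that metadata has exactly one entry per index row."""
    if len(metadata) != index_size:
        raise MetadataMismatchError(
            f"index has {index_size} tiles, metadata has {len(metadata)}"
        )
    # Assumption: tile ids are the Faiss row numbers 0..index_size-1.
    missing = [i for i in range(index_size) if i not in metadata]
    if missing:
        raise MetadataMismatchError(f"no metadata for tile ids {missing[:5]}")
    return True
```

An empty metadata file paired with a non-empty index fails the first check, which matches the "Empty Metadata File" test below.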
|
||||
|
||||
## Unit Tests
|
||||
|
||||
### Test: Load Valid Index
|
||||
- Input: Valid index file path from satellite provider
|
||||
- Verify: Returns True, index operational
|
||||
|
||||
### Test: Index Not Found
|
||||
- Input: Non-existent path
|
||||
- Verify: Raises `IndexNotFoundError`
|
||||
|
||||
### Test: Corrupted Index
|
||||
- Input: Corrupted/truncated index file
|
||||
- Verify: Raises `IndexCorruptedError`
|
||||
|
||||
### Test: Dimension Validation
|
||||
- Input: Index with wrong descriptor dimensions
|
||||
- Verify: Raises `IndexCorruptedError` with descriptive message
|
||||
|
||||
### Test: Metadata Mismatch
|
||||
- Input: Index with 1000 entries, metadata with 500 entries
|
||||
- Verify: Raises `MetadataMismatchError`
|
||||
|
||||
### Test: Empty Metadata File
|
||||
- Input: Valid index, empty metadata JSON
|
||||
- Verify: Raises `MetadataMismatchError`
|
||||
|
||||
### Test: Load Performance
|
||||
- Input: Index with 10,000 tiles
|
||||
- Verify: Load completes in <10 seconds
|
||||
|
||||
## Integration Tests
|
||||
|
||||
### Test: Faiss Manager Integration
|
||||
- Verify: Successfully delegates index loading to H04 Faiss Index Manager
|
||||
- Verify: Index accessible for subsequent queries
|
||||
|
||||
### Test: Query After Load
|
||||
- Setup: Load valid index
|
||||
- Action: Query with random descriptor
|
||||
- Verify: Returns valid matches (not empty, valid indices)
|
||||
|
||||
@@ -0,0 +1,95 @@
|
||||
# Feature: Descriptor Computation
|
||||
|
||||
## Description
|
||||
Compute global location descriptors using DINOv2 + VLAD aggregation. Supports both single-image descriptors and aggregate chunk descriptors for robust matching. DINOv2 features are semantic and invariant to season/texture changes, critical for UAV-to-satellite domain gap.
|
||||
|
||||
## Component APIs Implemented
|
||||
- `compute_location_descriptor(image: np.ndarray) -> np.ndarray`
|
||||
- `compute_chunk_descriptor(chunk_images: List[np.ndarray]) -> np.ndarray`
|
||||
|
||||
## External Tools and Services
|
||||
- **F16 Model Manager**: Provides DINOv2 inference engine via `get_inference_engine("DINOv2")`
|
||||
- **DINOv2**: Meta's foundation vision model for semantic feature extraction
|
||||
- **numpy**: Array operations and L2 normalization
|
||||
|
||||
## Internal Methods
|
||||
|
||||
### `_preprocess_image(image: np.ndarray) -> np.ndarray`
|
||||
Resizes and normalizes image for DINOv2 input (typically 224x224 or 518x518).
|
||||
|
||||
### `_extract_dense_features(preprocessed: np.ndarray) -> np.ndarray`
|
||||
Runs DINOv2 inference, extracts dense feature map from multiple spatial locations.
|
||||
|
||||
### `_vlad_aggregate(dense_features: np.ndarray, codebook: np.ndarray) -> np.ndarray`
|
||||
Applies VLAD (Vector of Locally Aggregated Descriptors) aggregation using pre-trained cluster centers.
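
A minimal NumPy sketch of VLAD with hard assignment, using toy sizes. The per-cluster (intra) normalization step is an assumption of this sketch; the spec only states that pre-trained cluster centers are used.

```python
import numpy as np


def vlad_aggregate(dense_features: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Aggregate (N, D) local features into a (K * D,) VLAD vector.

    codebook holds K cluster centers with shape (K, D).
    """
    # Squared distance from every feature to every center: shape (N, K).
    diffs = dense_features[:, None, :] - codebook[None, :, :]
    assignments = np.argmin((diffs ** 2).sum(axis=2), axis=1)

    k, d = codebook.shape
    vlad = np.zeros((k, d), dtype=np.float64)
    for cluster in range(k):
        members = dense_features[assignments == cluster]
        if len(members):
            # Sum of residuals between features and their assigned center.
            vlad[cluster] = (members - codebook[cluster]).sum(axis=0)

    # Assumed intra-normalization per cluster, then global L2 normalization.
    norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = np.divide(vlad, norms, out=np.zeros_like(vlad), where=norms > 0)
    flat = vlad.ravel()
    total = np.linalg.norm(flat)
    return flat / total if total > 0 else flat


rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))   # 50 local features, 8-dim (toy sizes)
centers = rng.normal(size=(4, 8))     # 4 cluster centers
descriptor = vlad_aggregate(features, centers)
```

The output length is always K × D, so the 4096 or 8192 dimensions quoted in this document would follow from the codebook size and feature width chosen by the index builder.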
|
||||
|
||||
### `_l2_normalize(descriptor: np.ndarray) -> np.ndarray`
|
||||
L2-normalizes descriptor vector for cosine similarity search.
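
A short sketch of the normalization, with a guard for zero vectors (the `eps` threshold and the `float32` cast are assumptions; Faiss operates on `float32` vectors). It also illustrates why normalization matters: for unit vectors, squared L2 distance equals `2 - 2 * cosine`, so an L2 search ranks results in the same order as cosine similarity.

```python
import numpy as np


def l2_normalize(descriptor: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale a descriptor to unit L2 norm; near-zero vectors are returned unscaled."""
    norm = np.linalg.norm(descriptor)
    if norm < eps:
        return descriptor.astype(np.float32)
    return (descriptor / norm).astype(np.float32)


a = l2_normalize(np.array([3.0, 4.0]))
b = l2_normalize(np.array([4.0, 3.0]))
cosine = float(a @ b)
squared_l2 = float(np.sum((a - b) ** 2))   # equals 2 - 2 * cosine for unit vectors
```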
|
||||
|
||||
### `_aggregate_chunk_descriptors(descriptors: List[np.ndarray], strategy: str) -> np.ndarray`
|
||||
Aggregates multiple descriptors into one using strategy: "mean" (default), "vlad", or "max".
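
The "mean" and "max" strategies might be sketched as follows. The "vlad" strategy is deliberately left out because this document does not specify how it is applied at chunk level; the function name and error messages are illustrative.

```python
from typing import List

import numpy as np


def aggregate_chunk_descriptors(descriptors: List[np.ndarray], strategy: str = "mean") -> np.ndarray:
    """Combine per-image descriptors into one unit-length chunk descriptor."""
    if not descriptors:
        raise ValueError("chunk must contain at least one descriptor")
    stacked = np.stack(descriptors)          # shape (N, D)
    if strategy == "mean":
        combined = stacked.mean(axis=0)
    elif strategy == "max":
        combined = stacked.max(axis=0)       # element-wise maximum
    else:
        # "vlad" is omitted: its chunk-level definition is not given in the spec.
        raise ValueError(f"unsupported strategy in this sketch: {strategy}")
    norm = np.linalg.norm(combined)
    return combined / norm if norm > 0 else combined


chunk = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
mean_desc = aggregate_chunk_descriptors(chunk)
max_desc = aggregate_chunk_descriptors(chunk, "max")
```

Because the result is re-normalized, a mean-aggregated chunk descriptor is the normalized mean, not the raw arithmetic mean, of the individual descriptors.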
|
||||
|
||||
## Unit Tests
|
||||
|
||||
### compute_location_descriptor
|
||||
|
||||
#### Test: Output Dimensions
|
||||
- Input: UAV image (any resolution)
|
||||
- Verify: Returns 4096-dim or 8192-dim vector
|
||||
|
||||
#### Test: Normalization
|
||||
- Input: Any valid image
|
||||
- Verify: Output L2 norm equals 1.0 within floating-point tolerance
|
||||
|
||||
#### Test: Deterministic Output
|
||||
- Input: Same image twice
|
||||
- Verify: Identical descriptors (no randomness)
|
||||
|
||||
#### Test: Season Invariance
|
||||
- Input: Two images of same location, different seasons
|
||||
- Verify: Cosine similarity > 0.7
|
||||
|
||||
#### Test: Location Discrimination
|
||||
- Input: Two images of different locations
|
||||
- Verify: Cosine similarity < 0.5
|
||||
|
||||
#### Test: Domain Invariance
|
||||
- Input: UAV image and satellite image of same location
|
||||
- Verify: Cosine similarity > 0.6 (cross-domain)
|
||||
|
||||
### compute_chunk_descriptor
|
||||
|
||||
#### Test: Empty Chunk
|
||||
- Input: Empty list
|
||||
- Verify: Raises ValueError
|
||||
|
||||
#### Test: Single Image Chunk
|
||||
- Input: List with one image
|
||||
- Verify: Equivalent to compute_location_descriptor
|
||||
|
||||
#### Test: Multiple Images Mean Aggregation
|
||||
- Input: 5 images from chunk
|
||||
- Verify: Descriptor equals the L2-normalized mean of the individual descriptors
|
||||
|
||||
#### Test: Aggregated Normalization
|
||||
- Input: Any chunk
|
||||
- Verify: Output L2 norm equals 1.0 within floating-point tolerance
|
||||
|
||||
#### Test: Chunk More Robust Than Single
|
||||
- Input: Featureless terrain chunk (10 images)
|
||||
- Verify: Chunk descriptor has lower variance than individual descriptors
|
||||
|
||||
## Integration Tests
|
||||
|
||||
### Test: Model Manager Integration
|
||||
- Verify: Successfully retrieves DINOv2 from F16 Model Manager
|
||||
- Verify: Model loaded with correct TensorRT/ONNX backend
|
||||
|
||||
### Test: Performance Budget
|
||||
- Input: FullHD image
|
||||
- Verify: Single descriptor computed in ~150ms
|
||||
|
||||
### Test: Chunk Performance
|
||||
- Input: 10 images
|
||||
- Verify: Chunk descriptor in <2s (parallelizable)
|
||||
|
||||
@@ -0,0 +1,130 @@
|
||||
# Feature: Candidate Retrieval
|
||||
|
||||
## Description
|
||||
Query satellite database and retrieve ranked tile candidates. Orchestrates the full place recognition pipeline: descriptor computation → database query → candidate ranking. Supports both single-image and chunk-based retrieval for "kidnapped robot" recovery.
|
||||
|
||||
## Component APIs Implemented
|
||||
- `query_database(descriptor: np.ndarray, top_k: int) -> List[DatabaseMatch]`
|
||||
- `rank_candidates(candidates: List[TileCandidate]) -> List[TileCandidate]`
|
||||
- `retrieve_candidate_tiles(image: np.ndarray, top_k: int) -> List[TileCandidate]`
|
||||
- `retrieve_candidate_tiles_for_chunk(chunk_images: List[np.ndarray], top_k: int) -> List[TileCandidate]`
|
||||
|
||||
## External Tools and Services
|
||||
- **H04 Faiss Index Manager**: For `search()` operation
|
||||
- **F04 Satellite Data Manager**: For tile metadata retrieval (gps_center, bounds) after Faiss returns indices
|
||||
- **F12 Route Chunk Manager**: Provides chunk images for chunk-based retrieval
|
||||
|
||||
## Internal Methods
|
||||
|
||||
### `_distance_to_similarity(distance: float) -> float`
|
||||
Converts an L2 distance to a normalized similarity score in [0, 1].
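
One possible mapping, shown as a sketch. It assumes descriptors are L2-normalized and that the distance is the squared Euclidean distance (which is what Faiss L2 indexes report). Under those assumptions the distance lies in [0, 4] and `1 - d / 4` equals `(1 + cosine) / 2`. The actual formula used by F08 is not stated in this document.

```python
def distance_to_similarity(squared_l2: float) -> float:
    """Map a squared L2 distance between unit vectors to a score in [0, 1].

    For unit vectors, squared_l2 = 2 - 2 * cosine, so it lies in [0, 4].
    """
    score = 1.0 - squared_l2 / 4.0
    return min(1.0, max(0.0, score))   # clamp against rounding error
```

Identical descriptors (distance 0) score 1.0, orthogonal ones (distance 2) score 0.5, and opposite ones (distance 4) score 0.0.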
|
||||
|
||||
### `_retrieve_tile_metadata(indices: List[int]) -> List[TileMetadata]`
|
||||
Fetches GPS center and bounds for tile indices from F04 Satellite Data Manager.
|
||||
|
||||
### `_apply_spatial_reranking(candidates: List[TileCandidate], dead_reckoning_estimate: Optional[GPSPoint]) -> List[TileCandidate]`
|
||||
Re-ranks candidates based on proximity to dead-reckoning estimate if available.
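
A sketch of one way to blend similarity with proximity. The `Candidate` class is a simplified stand-in for `TileCandidate`, and `scale_m` and `weight` are hypothetical tuning parameters, not values from this spec.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """Simplified stand-in for TileCandidate (assumption)."""
    tile_id: str
    gps_center: Tuple[float, float]   # (lat, lon) in degrees
    similarity_score: float


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(h))


def apply_spatial_reranking(
    candidates: List[Candidate],
    estimate: Optional[Tuple[float, float]],
    scale_m: float = 2000.0,   # hypothetical distance scale
    weight: float = 0.3,       # hypothetical weight of the spatial term
) -> List[Candidate]:
    if estimate is None:
        return list(candidates)            # no estimate: keep similarity order

    def combined(c: Candidate) -> float:
        spatial = math.exp(-haversine_m(c.gps_center, estimate) / scale_m)
        return (1 - weight) * c.similarity_score + weight * spatial

    return sorted(candidates, key=combined, reverse=True)


cands = [Candidate("far", (48.60, 35.00), 0.80), Candidate("near", (48.50, 35.00), 0.79)]
reranked = apply_spatial_reranking(cands, estimate=(48.50, 35.00))
```

With nearly equal similarity scores, the tile at the estimated position moves ahead of the one roughly 11 km away, which is the tie-breaking behavior the unit tests describe.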
|
||||
|
||||
### `_apply_trajectory_reranking(candidates: List[TileCandidate], previous_trajectory: Optional[List[GPSPoint]]) -> List[TileCandidate]`
|
||||
Favors tiles that continue the previous trajectory direction.
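
A simplified sketch of the idea: order candidates by how far they deviate from the last heading. It works on bare `(lat, lon)` points and ignores similarity scores, which a real implementation would presumably blend in. The flat-earth bearing approximation and all helper names are assumptions.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]   # (lat, lon) in degrees


def bearing_deg(start: Point, end: Point) -> float:
    """Approximate compass bearing from start to end (flat-earth, short range)."""
    d_lat = end[0] - start[0]
    d_lon = (end[1] - start[1]) * math.cos(math.radians(start[0]))
    return math.degrees(math.atan2(d_lon, d_lat)) % 360.0


def heading_deviation(trajectory: List[Point], candidate: Point) -> float:
    """Angle in [0, 180] between the last trajectory leg and the leg to the candidate."""
    heading = bearing_deg(trajectory[-2], trajectory[-1])
    to_candidate = bearing_deg(trajectory[-1], candidate)
    diff = abs(heading - to_candidate) % 360.0
    return min(diff, 360.0 - diff)


def apply_trajectory_reranking(
    candidates: List[Point], trajectory: Optional[List[Point]]
) -> List[Point]:
    if not trajectory or len(trajectory) < 2:
        return list(candidates)            # no usable heading: keep order
    # sorted() is stable, so equal deviations keep their original order.
    return sorted(candidates, key=lambda c: heading_deviation(trajectory, c))


track = [(48.00, 35.00), (48.01, 35.00)]   # flying due north
tiles = [(47.99, 35.00), (48.02, 35.00)]   # one behind, one ahead
ordered = apply_trajectory_reranking(tiles, track)
```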
|
||||
|
||||
### `_filter_by_geofence(candidates: List[TileCandidate], geofence: Optional[BoundingBox]) -> List[TileCandidate]`
|
||||
Removes candidates outside operational geofence.
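
A minimal sketch of the filter on tile centers. The `BoundingBox` class is a simplified stand-in for the real geofence type, and this version does not handle boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Simplified stand-in for the geofence type (assumption)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, point: Tuple[float, float]) -> bool:
        lat, lon = point
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def filter_by_geofence(
    centers: List[Tuple[float, float]], geofence: Optional[BoundingBox]
) -> List[Tuple[float, float]]:
    """Keep only tile centers inside the geofence; no geofence means no filtering."""
    if geofence is None:
        return list(centers)
    return [c for c in centers if geofence.contains(c)]


fence = BoundingBox(south=48.0, west=34.0, north=49.0, east=36.0)
kept = filter_by_geofence([(48.5, 35.0), (50.0, 35.0), (48.5, 37.0)], fence)
```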
|
||||
|
||||
## Unit Tests
|
||||
|
||||
### query_database
|
||||
|
||||
#### Test: Returns Top-K Matches
|
||||
- Input: Valid descriptor, top_k=5
|
||||
- Verify: Returns exactly 5 DatabaseMatch objects
|
||||
|
||||
#### Test: Ordered by Distance
|
||||
- Input: Any descriptor
|
||||
- Verify: Matches sorted by ascending distance
|
||||
|
||||
#### Test: Similarity Score Range
|
||||
- Input: Any query
|
||||
- Verify: All similarity_score values in [0, 1]
|
||||
|
||||
#### Test: Empty Database
|
||||
- Input: Query when no index loaded
|
||||
- Verify: Returns an empty list rather than raising an exception
|
||||
|
||||
#### Test: Query Performance
|
||||
- Input: Large database (10,000 tiles)
|
||||
- Verify: Query completes in <50ms
|
||||
|
||||
### rank_candidates
|
||||
|
||||
#### Test: Preserves Order Without Heuristics
|
||||
- Input: Candidates without dead-reckoning estimate
|
||||
- Verify: Order unchanged (similarity-based)
|
||||
|
||||
#### Test: Spatial Reranking Applied
|
||||
- Input: Candidates + dead-reckoning estimate
|
||||
- Verify: Closer tile promoted in ranking
|
||||
|
||||
#### Test: Tie Breaking
|
||||
- Input: Two candidates with similar similarity scores
|
||||
- Verify: Spatial proximity breaks tie
|
||||
|
||||
#### Test: Geofence Filtering
|
||||
- Input: Candidates with some outside geofence
|
||||
- Verify: Out-of-bounds candidates removed
|
||||
|
||||
### retrieve_candidate_tiles
|
||||
|
||||
#### Test: End-to-End Single Image
|
||||
- Input: UAV image, top_k=5
|
||||
- Verify: Returns 5 TileCandidate with valid gps_center
|
||||
|
||||
#### Test: Correct Tile in Top-5
|
||||
- Input: UAV image with known location
|
||||
- Verify: Correct tile appears in top-5 (Recall@5 test)
|
||||
|
||||
#### Test: Performance Budget
|
||||
- Input: FullHD UAV image
|
||||
- Verify: Total time <200ms (descriptor ~150ms + query ~50ms)
|
||||
|
||||
### retrieve_candidate_tiles_for_chunk
|
||||
|
||||
#### Test: End-to-End Chunk
|
||||
- Input: 10 chunk images, top_k=5
|
||||
- Verify: Returns 5 TileCandidate
|
||||
|
||||
#### Test: Chunk More Accurate Than Single
|
||||
- Input: Featureless terrain images
|
||||
- Verify: Chunk retrieval finds correct tile where single-image fails
|
||||
|
||||
#### Test: Recall@5 > 90%
|
||||
- Input: Various chunk scenarios
|
||||
- Verify: Correct tile in top-5 at least 90% of test cases
|
||||
|
||||
## Integration Tests
|
||||
|
||||
### Test: Faiss Manager Integration
|
||||
- Verify: query_database correctly delegates to H04 Faiss Index Manager
|
||||
|
||||
### Test: Satellite Data Manager Integration
|
||||
- Verify: Tile metadata correctly retrieved from F04 after Faiss query
|
||||
|
||||
### Test: Full Pipeline Single Image
|
||||
- Setup: Load index, prepare UAV image
|
||||
- Action: retrieve_candidate_tiles()
|
||||
- Verify: Returns valid candidates with GPS coordinates
|
||||
|
||||
### Test: Full Pipeline Chunk
|
||||
- Setup: Load index, prepare chunk images
|
||||
- Action: retrieve_candidate_tiles_for_chunk()
|
||||
- Verify: Returns valid candidates, more robust than single-image
|
||||
|
||||
### Test: Season Invariance
|
||||
- Setup: Satellite tiles from summer, UAV image from autumn
|
||||
- Action: retrieve_candidate_tiles()
|
||||
- Verify: Correct match despite appearance change
|
||||
|
||||
### Test: Recall@5 Benchmark
|
||||
- Input: Test dataset of 100 UAV images with ground truth
|
||||
- Verify: Recall@5 > 85% for single-image, > 90% for chunk
|
||||
|
||||
@@ -0,0 +1,424 @@
|
||||
# Global Place Recognition
|
||||
|
||||
## Interface Definition
|
||||
|
||||
**Interface Name**: `IGlobalPlaceRecognition`
|
||||
|
||||
### Interface Methods
|
||||
|
||||
```python
from __future__ import annotations  # lets annotations reference models defined elsewhere

from abc import ABC, abstractmethod
from typing import List

import numpy as np

# TileCandidate and DatabaseMatch are defined under "Data Models".


class IGlobalPlaceRecognition(ABC):
    @abstractmethod
    def retrieve_candidate_tiles(self, image: np.ndarray, top_k: int) -> List[TileCandidate]:
        pass

    @abstractmethod
    def compute_location_descriptor(self, image: np.ndarray) -> np.ndarray:
        pass

    @abstractmethod
    def query_database(self, descriptor: np.ndarray, top_k: int) -> List[DatabaseMatch]:
        pass

    @abstractmethod
    def rank_candidates(self, candidates: List[TileCandidate]) -> List[TileCandidate]:
        pass

    @abstractmethod
    def load_index(self, flight_id: str, index_path: str) -> bool:
        pass

    @abstractmethod
    def retrieve_candidate_tiles_for_chunk(self, chunk_images: List[np.ndarray], top_k: int) -> List[TileCandidate]:
        pass

    @abstractmethod
    def compute_chunk_descriptor(self, chunk_images: List[np.ndarray]) -> np.ndarray:
        pass
```
|
||||
|
||||
## Component Description
|
||||
|
||||
### Responsibilities
|
||||
- AnyLoc (DINOv2 + VLAD) for coarse localization after tracking loss
|
||||
- "Kidnapped robot" recovery after sharp turns
|
||||
- Compute image descriptors robust to season/appearance changes
|
||||
- Query Faiss index of satellite tile descriptors
|
||||
- Return top-k candidate tile regions for progressive refinement
|
||||
- **Load pre-built satellite descriptor index** (index is built by satellite provider, NOT by F08)
|
||||
- **Chunk semantic matching (aggregate DINOv2 features)**
|
||||
- **Chunk descriptor computation for robust matching**
|
||||
|
||||
### Scope
|
||||
- Global localization (not frame-to-frame)
|
||||
- Appearance-based place recognition
|
||||
- Handles domain gap (UAV vs satellite imagery)
|
||||
- Semantic feature extraction (DINOv2)
|
||||
- Efficient similarity search (Faiss)
|
||||
- **Chunk-level matching (more robust than single-image)**
|
||||
|
||||
## API Methods
|
||||
|
||||
### `retrieve_candidate_tiles(image: np.ndarray, top_k: int) -> List[TileCandidate]`
|
||||
|
||||
**Description**: Retrieves top-k candidate satellite tiles for a UAV image.
|
||||
|
||||
**Called By**:
|
||||
- F11 Failure Recovery Coordinator (after tracking loss)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
image: np.ndarray # UAV image
|
||||
top_k: int # Number of candidates (typically 5)
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
List[TileCandidate]:
    tile_id: str
    gps_center: GPSPoint
    similarity_score: float
    rank: int
```
|
||||
|
||||
**Processing Flow**:
|
||||
1. compute_location_descriptor(image) → descriptor
|
||||
2. query_database(descriptor, top_k) → database_matches
|
||||
3. Retrieve tile metadata for matches
|
||||
4. rank_candidates() → sorted by similarity
|
||||
5. Return top-k candidates
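
The five steps above could be wired together roughly as follows. This is a sketch with injected callables and plain dictionaries standing in for the real components and models; none of the helper names come from the F08 codebase.

```python
from typing import Callable, List, Sequence, Tuple

Match = Tuple[int, float]          # (tile index, similarity score)


def retrieve_candidates(
    image: object,
    top_k: int,
    compute_descriptor: Callable[[object], Sequence[float]],
    query_database: Callable[[Sequence[float], int], List[Match]],
    lookup_metadata: Callable[[int], dict],
) -> List[dict]:
    descriptor = compute_descriptor(image)                    # step 1
    matches = query_database(descriptor, top_k)               # step 2
    if not matches:
        return []                                             # index not loaded or query failed
    candidates = [
        {**lookup_metadata(idx), "similarity_score": score}   # step 3
        for idx, score in matches
    ]
    # Step 4: rank by similarity (re-ranking heuristics omitted in this sketch).
    candidates.sort(key=lambda c: c["similarity_score"], reverse=True)
    for rank, cand in enumerate(candidates, start=1):
        cand["rank"] = rank
    return candidates[:top_k]                                 # step 5


result = retrieve_candidates(
    image="frame",
    top_k=2,
    compute_descriptor=lambda img: [1.0, 0.0],
    query_database=lambda d, k: [(7, 0.6), (3, 0.9), (5, 0.4)][:k],
    lookup_metadata=lambda idx: {"tile_id": f"tile_{idx}"},
)
```

Passing the collaborators in as callables keeps the flow testable without a loaded model or index.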
|
||||
|
||||
**Error Conditions**:
|
||||
- Returns empty list: Database not initialized or query failed
|
||||
|
||||
**Test Cases**:
|
||||
1. **UAV image over Ukraine**: Returns relevant tiles
|
||||
2. **Different season**: DINOv2 handles appearance change
|
||||
3. **Recall@5**: Correct tile appears in the top-5 in > 85% of cases
|
||||
|
||||
---
|
||||
|
||||
### `compute_location_descriptor(image: np.ndarray) -> np.ndarray`
|
||||
|
||||
**Description**: Computes global descriptor using DINOv2 + VLAD aggregation.
|
||||
|
||||
**Called By**:
|
||||
- Internal (during retrieve_candidate_tiles)
|
||||
- System initialization (for satellite database)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
image: np.ndarray # UAV or satellite image
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
|
||||
np.ndarray: Descriptor vector (4096-dim or 8192-dim)
|
||||
```
|
||||
|
||||
**Algorithm (AnyLoc)**:
|
||||
1. Extract DINOv2 features (dense feature map)
|
||||
2. Apply VLAD (Vector of Locally Aggregated Descriptors) aggregation
|
||||
3. L2-normalize descriptor
|
||||
4. Return compact global descriptor
|
||||
|
||||
**Processing Details**:
|
||||
- Uses F16 Model Manager to get DINOv2 model
|
||||
- Dense features: extracts from multiple spatial locations
|
||||
- VLAD codebook: pre-trained cluster centers
|
||||
- Semantic features: invariant to texture/color changes
|
||||
|
||||
**Performance**:
|
||||
- Inference time: ~150ms for DINOv2 + VLAD
|
||||
|
||||
**Test Cases**:
|
||||
1. **Same location, different season**: Similar descriptors
|
||||
2. **Different locations**: Dissimilar descriptors
|
||||
3. **UAV vs satellite**: Domain-invariant features
|
||||
|
||||
---
|
||||
|
||||
### `query_database(descriptor: np.ndarray, top_k: int) -> List[DatabaseMatch]`
|
||||
|
||||
**Description**: Queries Faiss index for most similar satellite tiles.
|
||||
|
||||
**Called By**:
|
||||
- Internal (during retrieve_candidate_tiles)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
descriptor: np.ndarray # Query descriptor
|
||||
top_k: int
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
List[DatabaseMatch]:
    index: int               # Tile index in database
    distance: float          # L2 distance
    similarity_score: float  # Normalized score
```
|
||||
|
||||
**Processing Details**:
|
||||
- Uses H04 Faiss Index Manager
|
||||
- Index type: IVF (Inverted File) or HNSW for fast search
|
||||
- Distance metric: L2 (Euclidean)
|
||||
- Query time: ~10-50ms for 10,000+ tiles
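
To make the search contract concrete, here is an exact brute-force stand-in written in NumPy. It is not the Faiss call itself: an IVF or HNSW index returns approximate neighbors much faster. It does show the shape of the result that `query_database()` consumes: distances in ascending order paired with tile indices. The function name is hypothetical.

```python
import numpy as np


def brute_force_search(database: np.ndarray, query: np.ndarray, top_k: int):
    """Exact nearest-neighbour search; returns (squared distances, indices)."""
    top_k = min(top_k, len(database))
    sq_dist = ((database - query) ** 2).sum(axis=1)
    order = np.argsort(sq_dist)[:top_k]        # ascending distance
    return sq_dist[order], order


db = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]], dtype=np.float32)
distances, indices = brute_force_search(
    db, np.array([0.0, 1.0], dtype=np.float32), top_k=2
)
```

A stand-in like this can also serve as ground truth when checking the recall of an approximate index on a small test set.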
|
||||
|
||||
**Error Conditions**:
|
||||
- Returns empty list: Query failed
|
||||
|
||||
**Test Cases**:
|
||||
1. **Query satellite database**: Returns top-5 matches
|
||||
2. **Large database (10,000 tiles)**: Fast retrieval (<50ms)
|
||||
|
||||
---
|
||||
|
||||
### `rank_candidates(candidates: List[TileCandidate]) -> List[TileCandidate]`
|
||||
|
||||
**Description**: Re-ranks candidates based on additional heuristics.
|
||||
|
||||
**Called By**:
|
||||
- Internal (during retrieve_candidate_tiles)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
candidates: List[TileCandidate] # Initial ranking by similarity
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
|
||||
List[TileCandidate] # Re-ranked list
|
||||
```
|
||||
|
||||
**Re-ranking Factors**:
|
||||
1. **Similarity score**: Primary factor
|
||||
2. **Spatial proximity**: Prefer tiles near dead-reckoning estimate
|
||||
3. **Previous trajectory**: Favor continuation of route
|
||||
4. **Geofence constraints**: Within operational area
|
||||
|
||||
**Test Cases**:
|
||||
1. **Spatial re-ranking**: Closer tile promoted
|
||||
2. **Similar scores**: Spatial proximity breaks tie
|
||||
|
||||
---
|
||||
|
||||
### `load_index(flight_id: str, index_path: str) -> bool`
|
||||
|
||||
**Description**: Loads the pre-built satellite descriptor database from file. **Note**: The semantic index (DINOv2 descriptors + Faiss index) MUST be provided by the satellite data provider. F08 does NOT build the index; it only loads it.
|
||||
|
||||
**Called By**:
|
||||
- F02.1 Flight Lifecycle Manager (during flight initialization, index_path from F04 Satellite Data Manager)
|
||||
|
||||
**Input**:
|
||||
```python
flight_id: str   # Flight identifier (first parameter of the method signature)
index_path: str  # Path to pre-built Faiss index file from satellite provider
```
|
||||
|
||||
**Output**:
|
||||
```python
|
||||
bool: True if database loaded successfully
|
||||
```
|
||||
|
||||
**Processing Flow**:
|
||||
1. Load pre-built Faiss index from index_path
|
||||
2. Load tile metadata (tile_id → gps_center, bounds mapping)
|
||||
3. Validate index integrity (check descriptor dimensions, tile count)
|
||||
4. Return True if loaded successfully
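
A sketch of how the load flow and its error conditions might fit together. The reader is injected (in practice it could be `faiss.read_index`), the metadata is assumed to be loaded already, and the index object only needs the `d` and `ntotal` attributes that Faiss indexes expose. The function name and the demo values are hypothetical.

```python
import os
import tempfile
from types import SimpleNamespace
from typing import Callable, Dict


class IndexNotFoundError(Exception):
    """Index file does not exist."""


class IndexCorruptedError(Exception):
    """Index file is unreadable or has an unexpected format."""


class MetadataMismatchError(Exception):
    """Tile metadata does not match the index."""


ALLOWED_DIMS = (4096, 8192)


def load_index_flow(index_path: str, read_index: Callable[[str], object],
                    metadata: Dict[int, object]):
    if not os.path.isfile(index_path):
        raise IndexNotFoundError(index_path)
    try:
        index = read_index(index_path)          # e.g. faiss.read_index
    except Exception as exc:
        raise IndexCorruptedError(f"cannot read {index_path}") from exc
    if index.d not in ALLOWED_DIMS:
        raise IndexCorruptedError(f"unexpected descriptor dimension {index.d}")
    if index.ntotal != len(metadata):
        raise MetadataMismatchError(
            f"{index.ntotal} tiles vs {len(metadata)} metadata entries"
        )
    return index


# Demo with a fake index object so the sketch runs without Faiss installed.
fake = SimpleNamespace(d=4096, ntotal=2)
with tempfile.NamedTemporaryFile(suffix=".index") as tmp:
    loaded = load_index_flow(tmp.name, lambda path: fake, {0: "a", 1: "b"})
```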
|
||||
|
||||
**Satellite Provider Responsibility**:
|
||||
- Satellite provider builds the semantic index offline using DINOv2 + VLAD
|
||||
- Provider delivers index file along with satellite tiles
|
||||
- **Index format**: Faiss IVF1000 (Inverted File with 1000 clusters) + tile metadata JSON
|
||||
- Provider is responsible for index updates when satellite data changes
|
||||
- Index is rebuilt by provider whenever new satellite tiles are fetched on demand
|
||||
- Supported providers: Maxar, Google Maps, Copernicus, etc.
|
||||
|
||||
**Error Conditions**:
|
||||
- Raises `IndexNotFoundError`: Index file not found
|
||||
- Raises `IndexCorruptedError`: Index file corrupted or invalid format
|
||||
- Raises `MetadataMismatchError`: Metadata doesn't match index
|
||||
|
||||
**Performance**:
|
||||
- **Load time**: <10 seconds for 10,000+ tiles
|
||||
|
||||
**Test Cases**:
|
||||
1. **Load valid index**: Completes successfully, index operational
|
||||
2. **Index not found**: Raises IndexNotFoundError
|
||||
3. **Corrupted index**: Raises IndexCorruptedError
|
||||
4. **Index query after load**: Works correctly
|
||||
|
||||
---
|
||||
|
||||
### `retrieve_candidate_tiles_for_chunk(chunk_images: List[np.ndarray], top_k: int) -> List[TileCandidate]`
|
||||
|
||||
**Description**: Retrieves top-k candidate satellite tiles for a chunk using aggregate descriptor.
|
||||
|
||||
**Called By**:
|
||||
- F11 Failure Recovery Coordinator (chunk semantic matching)
|
||||
- F12 Route Chunk Manager (chunk matching coordination)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
chunk_images: List[np.ndarray] # 5-20 images from chunk
|
||||
top_k: int # Number of candidates (typically 5)
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
List[TileCandidate]:
    tile_id: str
    gps_center: GPSPoint
    similarity_score: float
    rank: int
```
|
||||
|
||||
**Processing Flow**:
|
||||
1. compute_chunk_descriptor(chunk_images) → aggregate descriptor
|
||||
2. query_database(descriptor, top_k) → database_matches
|
||||
3. Retrieve tile metadata for matches
|
||||
4. rank_candidates() → sorted by similarity
|
||||
5. Return top-k candidates
|
||||
|
||||
**Advantages over Single-Image Matching**:
|
||||
- Aggregate descriptor more robust to featureless terrain
|
||||
- Multiple images provide more context
|
||||
- Better handles plain fields where single-image matching fails
|
||||
|
||||
**Test Cases**:
|
||||
1. **Chunk matching**: Returns relevant tiles
|
||||
2. **Featureless terrain**: Succeeds where single-image fails
|
||||
3. **Recall@5**: Correct tile appears in the top-5 in > 90% of cases (better than single-image)
|
||||
|
||||
---
|
||||
|
||||
### `compute_chunk_descriptor(chunk_images: List[np.ndarray]) -> np.ndarray`
|
||||
|
||||
**Description**: Computes aggregate DINOv2 descriptor from multiple chunk images.
|
||||
|
||||
**Called By**:
|
||||
- Internal (during retrieve_candidate_tiles_for_chunk)
|
||||
- F12 Route Chunk Manager (chunk descriptor computation - delegates to F08)
|
||||
|
||||
**Input**:
|
||||
```python
|
||||
chunk_images: List[np.ndarray] # 5-20 images from chunk
|
||||
```
|
||||
|
||||
**Output**:
|
||||
```python
|
||||
np.ndarray: Aggregated descriptor vector (4096-dim or 8192-dim)
|
||||
```
|
||||
|
||||
**Algorithm**:
|
||||
1. For each image in chunk:
|
||||
- compute_location_descriptor(image) → descriptor (DINOv2 + VLAD)
|
||||
2. Aggregate descriptors:
|
||||
- **Mean aggregation**: Average all descriptors
|
||||
- **VLAD aggregation**: Use VLAD codebook for aggregation
|
||||
- **Max aggregation**: Element-wise maximum
|
||||
3. L2-normalize aggregated descriptor
|
||||
4. Return composite descriptor
|
||||
|
||||
**Aggregation Strategy**:
|
||||
- **Mean**: Simple average (default)
|
||||
- **VLAD**: More sophisticated, preserves spatial information
|
||||
- **Max**: Emphasizes strongest features
|
||||
|
||||
**Performance**:
|
||||
- Descriptor computation: ~150ms × N images (can be parallelized)
|
||||
- Aggregation: ~10ms
|
||||
|
||||
**Test Cases**:
|
||||
1. **Compute descriptor**: Returns aggregated descriptor
|
||||
2. **Multiple images**: Descriptor aggregates correctly
|
||||
3. **Descriptor quality**: More robust than single-image descriptor
|
||||
|
||||
## Integration Tests
|
||||
|
||||
### Test 1: Place Recognition Flow
|
||||
1. Load UAV image from sharp turn
|
||||
2. retrieve_candidate_tiles(top_k=5)
|
||||
3. Verify correct tile in top-5
|
||||
4. Pass candidates to F11 Failure Recovery
|
||||
|
||||
### Test 2: Season Invariance
|
||||
1. Satellite tiles from summer
|
||||
2. UAV images from autumn
|
||||
3. retrieve_candidate_tiles() → correct match despite appearance change
|
||||
|
||||
### Test 3: Index Loading
|
||||
1. Prepare pre-built index file from satellite provider
|
||||
2. load_index(flight_id, index_path)
|
||||
3. Verify Faiss index loaded correctly
|
||||
4. Query with test image → returns matches
|
||||
|
||||
### Test 4: Chunk Semantic Matching
|
||||
1. Build chunk with 10 images (plain field scenario)
|
||||
2. compute_chunk_descriptor() → aggregate descriptor
|
||||
3. retrieve_candidate_tiles_for_chunk() → returns candidates
|
||||
4. Verify correct tile in top-5 (where single-image matching failed)
|
||||
5. Verify chunk matching more robust than single-image
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
### Performance
|
||||
- **retrieve_candidate_tiles**: < 200ms total
|
||||
- Descriptor computation: ~150ms
|
||||
- Database query: ~50ms
|
||||
- **compute_location_descriptor**: ~150ms
|
||||
- **query_database**: ~10-50ms
|
||||
|
||||
### Accuracy
|
||||
- **Recall@5**: > 85% (correct tile in top-5)
|
||||
- **Recall@1**: > 60% (correct tile is top-1)
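
For reference, Recall@k over a labelled test set can be computed as in this sketch. The tile ids are made up, and `recall_at_k` is a hypothetical helper, not part of the F08 API.

```python
from typing import Sequence


def recall_at_k(ranked_results: Sequence[Sequence[str]],
                ground_truth: Sequence[str], k: int) -> float:
    """Fraction of queries whose correct tile appears in the first k results."""
    if len(ranked_results) != len(ground_truth):
        raise ValueError("one ground-truth tile is required per query")
    if not ground_truth:
        return 0.0
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(ranked_results, ground_truth))
    return hits / len(ground_truth)


results = [["t1", "t9", "t4"], ["t7", "t2", "t5"], ["t3", "t8", "t6"], ["t0", "t1", "t2"]]
truth = ["t1", "t2", "t6", "t5"]
r1 = recall_at_k(results, truth, k=1)   # only the first query hits at rank 1
r3 = recall_at_k(results, truth, k=3)   # the first three queries hit within the top 3
```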
|
||||
|
||||
### Scalability
|
||||
- Support 10,000+ satellite tiles in database
|
||||
- Fast query even with large database
|
||||
|
||||
## Dependencies
|
||||
|
||||
### Internal Components
|
||||
- **F16 Model Manager**: For DINOv2 inference engine via `get_inference_engine("DINOv2")`.
|
||||
- **H04 Faiss Index Manager**: For similarity search via `load_index()`, `search()`. Critical for `query_database()`.
|
||||
- **F04 Satellite Data Manager**: For tile metadata retrieval after Faiss search returns tile indices.
|
||||
- **F12 Route Chunk Manager**: For chunk image retrieval during chunk descriptor computation.
|
||||
|
||||
### External Dependencies
|
||||
- **DINOv2**: Foundation vision model
|
||||
- **Faiss**: Similarity search library
|
||||
- **numpy**: Array operations
|
||||
|
||||
## Data Models
|
||||
|
||||
### TileCandidate
|
||||
```python
class TileCandidate(BaseModel):
    tile_id: str
    gps_center: GPSPoint
    bounds: TileBounds
    similarity_score: float
    rank: int
    spatial_score: Optional[float]
```
|
||||
|
||||
### DatabaseMatch
|
||||
```python
class DatabaseMatch(BaseModel):
    index: int
    tile_id: str
    distance: float
    similarity_score: float
```
|
||||
|
||||
### SatelliteTile
|
||||
```python
class SatelliteTile(BaseModel):
    tile_id: str
    image: np.ndarray
    gps_center: GPSPoint
    bounds: TileBounds
    descriptor: Optional[np.ndarray]
```
|
||||
|
||||