# Metric Refinement

## Interface Definition

**Interface Name**: `IMetricRefinement`

### Interface Methods

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

import numpy as np


class IMetricRefinement(ABC):
    @abstractmethod
    def align_to_satellite(self, uav_image: np.ndarray, satellite_tile: np.ndarray,
                           tile_bounds: TileBounds) -> Optional[AlignmentResult]:
        pass

    @abstractmethod
    def compute_homography(self, uav_image: np.ndarray,
                           satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass

    @abstractmethod
    def extract_gps_from_alignment(self, homography: np.ndarray, tile_bounds: TileBounds,
                                   image_center: Tuple[int, int]) -> GPSPoint:
        pass

    @abstractmethod
    def compute_match_confidence(self, alignment: AlignmentResult) -> float:
        pass

    @abstractmethod
    def align_chunk_to_satellite(self, chunk_images: List[np.ndarray],
                                 satellite_tile: np.ndarray,
                                 tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]:
        pass

    @abstractmethod
    def match_chunk_homography(self, chunk_images: List[np.ndarray],
                               satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass
```

## Component Description

### Responsibilities

- LiteSAM for precise UAV-to-satellite cross-view matching
- **Requires pre-rotated images** from Image Rotation Manager
- Compute homography mapping UAV image to satellite tile
- Extract absolute GPS coordinates from alignment
- Process against a single tile (drift correction) or a tile grid (progressive search)
- Achieve <20m accuracy requirement
- **Chunk-to-satellite matching (more robust than single-image)**
- **Chunk homography computation**

### Scope

- Cross-view geo-localization (UAV↔satellite)
- Handles altitude variations (<1km)
- Multi-scale processing for different GSDs
- Domain gap (UAV downward vs satellite nadir view)
- **Critical**: Fails if rotation >45° (handled by F06)
- **Chunk-level matching (aggregates correspondences from multiple images)**

## API Methods

### `align_to_satellite(uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]`

**Description**: Aligns a UAV image to a satellite tile, returning the GPS location.

**Called By**:
- F06 Image Rotation Manager (during rotation sweep)
- F11 Failure Recovery Coordinator (progressive search)
- F02 Flight Processor (drift correction with single tile)

**Input**:
```python
uav_image: np.ndarray       # Pre-rotated UAV image
satellite_tile: np.ndarray  # Reference satellite tile
tile_bounds: TileBounds     # GPS bounds and GSD of the satellite tile
```

**Output**:
```python
AlignmentResult:
    matched: bool
    homography: np.ndarray  # 3×3 transformation matrix
    gps_center: GPSPoint    # UAV image center GPS
    confidence: float
    inlier_count: int
    total_correspondences: int
```

**Processing Flow**:
1. Extract features from both images using the LiteSAM encoder
2. Compute dense correspondence field
3. Estimate homography from correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
   - Extract GPS from homography using tile_bounds
   - Return AlignmentResult
6. If no match:
   - Return None

**Match Criteria**:
- **Good match**: inlier_count > 30, confidence > 0.7
- **Weak match**: inlier_count 15-30, confidence 0.5-0.7
- **No match**: inlier_count < 15

**Error Conditions**:
- Returns `None`: No match found, or rotation >45° (image should be pre-rotated)

**Test Cases**:
1. **Good alignment**: Returns GPS within 20m of ground truth
2. **Altitude variation**: Handles GSD mismatch
3. **Rotation >45°**: Fails (by design, requires pre-rotation)
4. **Multi-scale**: Processes at multiple scales

---

### `compute_homography(uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]`

**Description**: Computes the homography transformation from the UAV image to the satellite tile.

**Called By**:
- Internal (during align_to_satellite)

**Input**:
```python
uav_image: np.ndarray
satellite_tile: np.ndarray
```

**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```

**Algorithm (LiteSAM)**:
1. Extract multi-scale features using TAIFormer
2. Compute correlation via Convolutional Token Mixer (CTM)
3. Generate dense correspondences
4. Estimate homography using RANSAC
5. Refine with non-linear optimization

**Homography Properties**:
- Maps pixels from the UAV image to the satellite image
- Accounts for scale, rotation, and perspective
- 8 DoF (degrees of freedom)

**Error Conditions**:
- Returns `None`: Insufficient correspondences

**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None

---

### `extract_gps_from_alignment(homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint`

**Description**: Extracts GPS coordinates from the homography and tile georeferencing.

**Called By**:
- Internal (during align_to_satellite)
- F06 Image Rotation Manager (for precise angle calculation)

**Input**:
```python
homography: np.ndarray         # 3×3 matrix
tile_bounds: TileBounds        # GPS bounds of satellite tile
image_center: Tuple[int, int]  # Center pixel of UAV image
```

**Output**:
```python
GPSPoint:
    lat: float
    lon: float
```

**Algorithm**:
1. Apply the homography to the UAV image center point
2. Get pixel coordinates in the satellite tile
3. Convert the satellite pixel to GPS using tile_bounds and GSD
4. Return GPS coordinates

**Uses**: tile_bounds parameter, H02 GSD Calculator

**Test Cases**:
1. **Center alignment**: UAV center → correct GPS
2. **Corner alignment**: UAV corner → correct GPS
3. **Multiple points**: All points consistent

---

### `compute_match_confidence(alignment: AlignmentResult) -> float`

**Description**: Computes a match confidence score from alignment quality.

**Called By**:
- Internal (during align_to_satellite)
- F11 Failure Recovery Coordinator (to decide whether a match is acceptable)

**Input**:
```python
alignment: AlignmentResult
```

**Output**:
```python
float: Confidence score (0.0 to 1.0)
```

**Confidence Factors**:
1. **Inlier ratio**: inliers / total_correspondences
2. **Inlier count**: Absolute number of inliers
3. **Reprojection error**: Mean error of inliers (in pixels)
4. **Spatial distribution**: Inliers well-distributed vs clustered

**Thresholds**:
- **High confidence (>0.8)**: inlier_ratio > 0.6, inlier_count > 50, MRE < 0.5px
- **Medium confidence (0.5-0.8)**: inlier_ratio > 0.4, inlier_count > 30
- **Low confidence (<0.5)**: Reject match

**Test Cases**:
1. **Good match**: confidence > 0.8
2. **Weak match**: confidence 0.5-0.7
3. **Poor match**: confidence < 0.5

---

### `align_chunk_to_satellite(chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]`

**Description**: Aligns an entire chunk to a satellite tile, returning the GPS location.

**Called By**:
- F06 Image Rotation Manager (during chunk rotation sweep)
- F11 Failure Recovery Coordinator (chunk LiteSAM matching)

**Input**:
```python
chunk_images: List[np.ndarray]  # Pre-rotated chunk images (5-20 images)
satellite_tile: np.ndarray      # Reference satellite tile
tile_bounds: TileBounds         # GPS bounds and GSD of the satellite tile
```

**Output**:
```python
ChunkAlignmentResult:
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint  # GPS of chunk center (middle frame)
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform
```

**Processing Flow**:
1. For each image in the chunk:
   - Extract features using the LiteSAM encoder
   - Compute correspondences with the satellite tile
2. Aggregate correspondences from all images
3. Estimate homography from the aggregate correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
   - Extract GPS for the chunk center using tile_bounds
   - Compute the Sim(3) transform (translation, rotation, scale)
   - Return ChunkAlignmentResult
6. If no match:
   - Return None

**Match Criteria**:
- **Good match**: inlier_count > 50, confidence > 0.7
- **Weak match**: inlier_count 30-50, confidence 0.5-0.7
- **No match**: inlier_count < 30

**Advantages over Single-Image Matching**:
- More correspondences (aggregated from multiple images)
- More robust to featureless terrain
- Better handling of partial occlusions
- Higher confidence scores

**Test Cases**:
1. **Chunk alignment**: Returns GPS within 20m of ground truth
2. **Featureless terrain**: Succeeds where single-image matching fails
3. **Rotation >45°**: Fails (requires pre-rotation via F06)
4. **Multi-scale**: Handles GSD mismatch

---

### `match_chunk_homography(chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]`

**Description**: Computes the homography transformation from the chunk to the satellite tile.

**Called By**:
- Internal (during align_chunk_to_satellite)

**Input**:
```python
chunk_images: List[np.ndarray]
satellite_tile: np.ndarray
```

**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```

**Algorithm (LiteSAM)**:
1. Extract multi-scale features from all chunk images using TAIFormer
2. Aggregate features (mean or max pooling)
3. Compute correlation via Convolutional Token Mixer (CTM)
4. Generate dense correspondences
5. Estimate homography using RANSAC
6. Refine with non-linear optimization

**Homography Properties**:
- Maps pixels from the chunk center to the satellite image
- Accounts for scale, rotation, and perspective
- 8 DoF (degrees of freedom)

**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None
3. **Aggregate correspondences**: More robust than single-image matching

## Integration Tests

### Test 1: Single Tile Drift Correction
1. Load UAV image and expected satellite tile
2. Pre-rotate UAV image to known heading
3. align_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth

### Test 2: Progressive Search (4 tiles)
1. Load UAV image from sharp turn
2. Get 2×2 tile grid from F04
3. align_to_satellite() for each tile (with tile_bounds)
4. First 3 tiles: No match
5. 4th tile: Match found → GPS extracted

### Test 3: Rotation Sensitivity
1. Rotate UAV image by 60° (not pre-rotated)
2. align_to_satellite() → returns None (fails as expected)
3. Pre-rotate to 60°
4. align_to_satellite() → succeeds

### Test 4: Multi-Scale Robustness
1. UAV at 500m altitude (GSD=0.1m/pixel)
2. Satellite at zoom 19 (GSD=0.3m/pixel)
3. LiteSAM handles scale difference → match succeeds

### Test 5: Chunk LiteSAM Matching
1. Build chunk with 10 images (plain field scenario)
2. Pre-rotate chunk to known heading
3. align_chunk_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth
5. Verify chunk matching is more robust than single-image matching

### Test 6: Chunk Rotation Sweeps
1. Build chunk with unknown orientation
2. Try chunk rotation steps (0°, 30°, ..., 330°)
3. align_chunk_to_satellite() for each rotation
4. Match found at 120° → GPS extracted
5. Verify Sim(3) transform computed correctly

## Non-Functional Requirements

### Performance
- **align_to_satellite**: ~60ms per tile (TensorRT-optimized)
- **Progressive search over 25 tiles**: ~1.5 seconds total (25 × 60ms)
- Meets the <5s-per-frame requirement

### Accuracy
- **GPS accuracy**: 60% of frames < 20m error, 80% < 50m error
- **Mean Reprojection Error (MRE)**: < 1.0 pixels
- **Alignment success rate**: > 95% when rotation is correct

### Reliability
- Graceful failure when no match is found
- Robust to altitude variations (<1km)
- Handles seasonal appearance changes (to the extent possible)

## Dependencies

### Internal Components
- **F16 Model Manager**: For the LiteSAM model
- **H01 Camera Model**: For projection operations
- **H02 GSD Calculator**: For coordinate transformations
- **H05 Performance Monitor**: For timing
- **F12 Route Chunk Manager**: For chunk image retrieval

**Note**: tile_bounds is passed as a parameter from the caller (F02 Flight Processor gets it from F04 Satellite Data Manager)

### External Dependencies
- **LiteSAM**: Cross-view matching model
- **opencv-python**: Homography estimation
- **numpy**: Matrix operations

## Data Models

### AlignmentResult
```python
class AlignmentResult(BaseModel):
    matched: bool
    homography: np.ndarray     # (3, 3)
    gps_center: GPSPoint
    confidence: float
    inlier_count: int
    total_correspondences: int
    reprojection_error: float  # Mean error in pixels
```

### GPSPoint
```python
class GPSPoint(BaseModel):
    lat: float
    lon: float
```

### TileBounds
```python
class TileBounds(BaseModel):
    nw: GPSPoint
    ne: GPSPoint
    sw: GPSPoint
    se: GPSPoint
    center: GPSPoint
    gsd: float  # Ground Sampling Distance (m/pixel)
```

### LiteSAMConfig
```python
class LiteSAMConfig(BaseModel):
    model_path: str
    confidence_threshold: float = 0.7
    min_inliers: int = 15
    max_reprojection_error: float = 2.0  # pixels
    multi_scale_levels: int = 3
    chunk_min_inliers: int = 30  # Higher threshold for chunk matching
```

### ChunkAlignmentResult
```python
class ChunkAlignmentResult(BaseModel):
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform   # Translation, rotation, scale
    reprojection_error: float  # Mean error in pixels
```

### Sim3Transform
```python
class Sim3Transform(BaseModel):
    translation: np.ndarray  # (3,) translation vector
    rotation: np.ndarray     # (3, 3) rotation matrix or (4,) quaternion
    scale: float             # Scale factor
```
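The GPS-extraction algorithm (apply homography to the image center, then convert the satellite pixel to GPS via the tile bounds and GSD) and the confidence factors described above can be sketched as follows. This is a minimal illustration, not the component's implementation: the helper names, the north-up/NW-origin tile assumption, the equirectangular metres-to-degrees conversion (the real component delegates this to H02 GSD Calculator), and the confidence blend weights are all assumptions.

```python
import numpy as np

METERS_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def apply_homography(H: np.ndarray, x: float, y: float):
    """Map a UAV pixel (x, y) through a 3x3 homography into tile pixel coords."""
    v = H @ np.array([x, y, 1.0])
    return v[0] / v[2], v[1] / v[2]  # perspective divide

def tile_pixel_to_gps(px: float, py: float,
                      nw_lat: float, nw_lon: float, gsd: float):
    """Convert a satellite-tile pixel to (lat, lon).

    Assumes a north-up tile with pixel (0, 0) at the NW corner, small enough
    for a local equirectangular approximation.
    """
    lat = nw_lat - (py * gsd) / METERS_PER_DEG_LAT
    meters_per_deg_lon = METERS_PER_DEG_LAT * np.cos(np.radians(nw_lat))
    lon = nw_lon + (px * gsd) / meters_per_deg_lon
    return lat, lon

def match_confidence(inlier_count: int, total: int, mre_px: float) -> float:
    """Blend the documented factors into a 0..1 score (weights are assumed)."""
    if total == 0:
        return 0.0
    ratio_term = min(inlier_count / total / 0.6, 1.0)  # saturates at ratio 0.6
    count_term = min(inlier_count / 50.0, 1.0)         # saturates at 50 inliers
    error_term = max(0.0, 1.0 - mre_px / 2.0)          # 0 at the 2px reject limit
    return 0.5 * ratio_term + 0.3 * count_term + 0.2 * error_term

# A homography near identity: the UAV centre lands on the same tile pixel.
H = np.eye(3)
px, py = apply_homography(H, 512.0, 512.0)
lat, lon = tile_pixel_to_gps(px, py, nw_lat=48.0, nw_lon=11.0, gsd=0.3)
score = match_confidence(inlier_count=60, total=80, mre_px=0.4)  # strong match
```

The north-up assumption mirrors the component's pre-rotation requirement: because F06 delivers rotation-corrected imagery, the homography mostly absorbs translation, scale, and residual perspective rather than large rotations.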