Metric Refinement
Interface Definition
Interface Name: IMetricRefinement
Interface Methods
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

import numpy as np

class IMetricRefinement(ABC):
    @abstractmethod
    def align_to_satellite(self, uav_image: np.ndarray, satellite_tile: np.ndarray,
                           tile_bounds: TileBounds) -> Optional[AlignmentResult]:
        pass

    @abstractmethod
    def compute_homography(self, uav_image: np.ndarray,
                           satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass

    @abstractmethod
    def extract_gps_from_alignment(self, homography: np.ndarray, tile_bounds: TileBounds,
                                   image_center: Tuple[int, int]) -> GPSPoint:
        pass

    @abstractmethod
    def compute_match_confidence(self, alignment: AlignmentResult) -> float:
        pass

    @abstractmethod
    def align_chunk_to_satellite(self, chunk_images: List[np.ndarray], satellite_tile: np.ndarray,
                                 tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]:
        pass

    @abstractmethod
    def match_chunk_homography(self, chunk_images: List[np.ndarray],
                               satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass
Component Description
Responsibilities
- Run LiteSAM for precise UAV-to-satellite cross-view matching
- Require pre-rotated images from the Image Rotation Manager (F06)
- Compute the homography mapping the UAV image to the satellite tile
- Extract absolute GPS coordinates from the alignment
- Process against a single tile (drift correction) or a tile grid (progressive search)
- Achieve the <20m accuracy requirement
- Perform chunk-to-satellite matching (more robust than single-image matching)
- Compute chunk-level homographies
Scope
- Cross-view geo-localization (UAV↔satellite)
- Handles altitude variations (<1km)
- Multi-scale processing for different GSDs
- Domain gap (UAV downward vs satellite nadir view)
- Critical: Fails if rotation >45° (handled by F06 Image Rotation Manager)
- Chunk-level matching (aggregate correspondences from multiple images)
API Methods
align_to_satellite(uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]
Description: Aligns UAV image to satellite tile, returning GPS location.
Called By:
- F06 Image Rotation Manager (during rotation sweep)
- F11 Failure Recovery Coordinator (progressive search)
- F02 Flight Processor (drift correction with single tile)
Input:
uav_image: np.ndarray # Pre-rotated UAV image
satellite_tile: np.ndarray # Reference satellite tile
tile_bounds: TileBounds # GPS bounds and GSD of the satellite tile
Output:
AlignmentResult:
    matched: bool
    homography: np.ndarray  # 3×3 transformation matrix
    gps_center: GPSPoint  # UAV image center GPS
    confidence: float
    inlier_count: int
    total_correspondences: int
Processing Flow:
- Extract features from both images using LiteSAM encoder
- Compute dense correspondence field
- Estimate homography from correspondences
- Validate match quality (inlier count, reprojection error)
- If valid match:
- Extract GPS from homography using tile_bounds
- Return AlignmentResult
- If no match:
- Return None
Match Criteria:
- Good match: inlier_count > 30, confidence > 0.7
- Weak match: inlier_count 15-30, confidence 0.5-0.7
- No match: inlier_count < 15
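The match criteria above can be sketched as a small helper; `classify_match` and its string labels are illustrative names, not part of the interface:

```python
def classify_match(inlier_count: int, confidence: float) -> str:
    """Map alignment statistics to the quality tiers defined above.
    Hypothetical helper: the spec fixes only the thresholds, not this API."""
    if inlier_count > 30 and confidence > 0.7:
        return "good"
    if inlier_count >= 15 and confidence >= 0.5:
        return "weak"
    return "none"
```

A borderline case such as many inliers but low confidence falls through to the weak tier rather than being accepted outright.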
Error Conditions:
- Returns None: No match found, or rotation >45° (image should be pre-rotated)
Test Cases:
- Good alignment: Returns GPS within 20m of ground truth
- Altitude variation: Handles GSD mismatch
- Rotation >45°: Fails (by design, requires pre-rotation)
- Multi-scale: Processes at multiple scales
compute_homography(uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]
Description: Computes homography transformation from UAV to satellite.
Called By:
- Internal (during align_to_satellite)
Input:
uav_image: np.ndarray
satellite_tile: np.ndarray
Output:
Optional[np.ndarray]: 3×3 homography matrix or None
Algorithm (LiteSAM):
- Extract multi-scale features using TAIFormer
- Compute correlation via Convolutional Token Mixer (CTM)
- Generate dense correspondences
- Estimate homography using RANSAC
- Refine with non-linear optimization
Homography Properties:
- Maps pixels from UAV image to satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)
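The core of step 4 is a linear homography fit; a minimal Direct Linear Transform sketch is below. In the pipeline this fit would run inside a RANSAC loop over the dense LiteSAM correspondences, and this sketch omits Hartley normalization and the non-linear refinement of step 5:

```python
import numpy as np

def estimate_homography_dlt(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Fit a 3x3 homography H with dst ~ H @ src via the Direct Linear
    Transform. src and dst are (N, 2) matched pixel coordinates, N >= 4."""
    if src.shape[0] < 4:
        raise ValueError("need at least 4 correspondences")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on h = vec(H)
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)       # null-space vector of A
    return H / H[2, 2]             # fix the scale: a homography has 8 DoF
```

The final normalization is why the matrix has 8 rather than 9 degrees of freedom.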
Error Conditions:
- Returns None: Insufficient correspondences
Test Cases:
- Valid correspondence: Returns 3×3 matrix
- Insufficient features: Returns None
extract_gps_from_alignment(homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint
Description: Extracts GPS coordinates from homography and tile georeferencing.
Called By:
- Internal (during align_to_satellite)
- F06 Image Rotation Manager (for precise angle calculation)
Input:
homography: np.ndarray # 3×3 matrix
tile_bounds: TileBounds # GPS bounds of satellite tile
image_center: Tuple[int, int] # Center pixel of UAV image
Output:
GPSPoint:
    lat: float
    lon: float
Algorithm:
- Apply homography to UAV image center point
- Get pixel coordinates in satellite tile
- Convert satellite pixel to GPS using tile_bounds and GSD
- Return GPS coordinates
Uses: tile_bounds parameter, H02 GSD Calculator
Test Cases:
- Center alignment: UAV center → correct GPS
- Corner alignment: UAV corner → correct GPS
- Multiple points: All points consistent
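The algorithm above can be sketched as follows, with a minimal GPSPoint stand-in. The sketch assumes a north-up, axis-aligned tile and interpolates linearly between the NW and SE corners; the real TileBounds carries all four corners plus the GSD, so treat this as a simplification:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GPSPoint:
    lat: float
    lon: float

def extract_gps(homography: np.ndarray, image_center: tuple,
                nw: GPSPoint, se: GPSPoint, tile_w: int, tile_h: int) -> GPSPoint:
    """Project the UAV image center through H into the satellite tile, then
    convert that pixel to GPS by linear interpolation over the tile bounds."""
    cx, cy = image_center
    px, py, w = homography @ np.array([cx, cy, 1.0])
    px, py = px / w, py / w                           # pixel in satellite tile
    lon = nw.lon + (px / tile_w) * (se.lon - nw.lon)
    lat = nw.lat + (py / tile_h) * (se.lat - nw.lat)  # latitude decreases downward
    return GPSPoint(lat=lat, lon=lon)
```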
compute_match_confidence(alignment: AlignmentResult) -> float
Description: Computes match confidence score from alignment quality.
Called By:
- Internal (during align_to_satellite)
- F11 Failure Recovery Coordinator (to decide if match acceptable)
Input:
alignment: AlignmentResult
Output:
float: Confidence score (0.0 to 1.0)
Confidence Factors:
- Inlier ratio: inliers / total_correspondences
- Inlier count: Absolute number of inliers
- Reprojection error: Mean error of inliers (in pixels)
- Spatial distribution: Inliers well-distributed vs clustered
Thresholds:
- High confidence (>0.8): inlier_ratio > 0.6, inlier_count > 50, MRE < 0.5px
- Medium confidence (0.5-0.8): inlier_ratio > 0.4, inlier_count > 30
- Low confidence (<0.5): Reject match
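One way to combine the factors above into a single score is a weighted sum with saturating terms. The weights and saturation points here are illustrative assumptions, since the spec fixes only the output thresholds, and the spatial-distribution factor is omitted from this sketch:

```python
def compute_confidence(inliers: int, total: int, mre: float) -> float:
    """Combine inlier ratio, inlier count, and mean reprojection error (MRE,
    in pixels) into a score in [0, 1]. Weights are illustrative assumptions."""
    if total == 0:
        return 0.0
    ratio_term = min(inliers / total, 1.0)
    count_term = min(inliers / 50.0, 1.0)    # saturates at 50 inliers
    error_term = max(0.0, 1.0 - mre / 2.0)   # 1 at 0 px MRE, 0 at 2 px
    return 0.4 * ratio_term + 0.3 * count_term + 0.3 * error_term
```

For example, 60 inliers out of 80 with 0.4 px MRE scores above the 0.8 high-confidence threshold, while 10 inliers out of 100 with 1.8 px MRE falls below the 0.5 rejection line.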
Test Cases:
- Good match: confidence > 0.8
- Weak match: confidence 0.5-0.7
- Poor match: confidence < 0.5
align_chunk_to_satellite(chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]
Description: Aligns entire chunk to satellite tile, returning GPS location.
Called By:
- F06 Image Rotation Manager (during chunk rotation sweep)
- F11 Failure Recovery Coordinator (chunk LiteSAM matching)
Input:
chunk_images: List[np.ndarray] # Pre-rotated chunk images (5-20 images)
satellite_tile: np.ndarray # Reference satellite tile
tile_bounds: TileBounds # GPS bounds and GSD of the satellite tile
Output:
ChunkAlignmentResult:
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint  # GPS of chunk center (middle frame)
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform
Processing Flow:
- For each image in chunk:
- Extract features using LiteSAM encoder
- Compute correspondences with satellite tile
- Aggregate correspondences from all images
- Estimate homography from aggregate correspondences
- Validate match quality (inlier count, reprojection error)
- If valid match:
- Extract GPS from chunk center using tile_bounds
- Compute Sim(3) transform (translation, rotation, scale)
- Return ChunkAlignmentResult
- If no match:
- Return None
Match Criteria:
- Good match: inlier_count > 50, confidence > 0.7
- Weak match: inlier_count 30-50, confidence 0.5-0.7
- No match: inlier_count < 30
Advantages over Single-Image Matching:
- More correspondences (aggregate from multiple images)
- More robust to featureless terrain
- Better handles partial occlusions
- Higher confidence scores
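The first advantage — more correspondences — comes from pooling per-image matches before the single homography fit, which can be sketched as below. The `per_image_matches` layout is a hypothetical representation of the per-image correspondence output:

```python
import numpy as np

def aggregate_correspondences(per_image_matches):
    """Pool correspondences from every chunk image into one set for a single
    RANSAC homography fit. per_image_matches is a list of (src_pts, dst_pts)
    pairs, each an (N_i, 2) array of UAV-pixel / satellite-pixel matches."""
    src = np.vstack([s for s, _ in per_image_matches])
    dst = np.vstack([d for _, d in per_image_matches])
    return src, dst
```

Even when each image alone yields too few matches (e.g. over featureless terrain), the pooled set can clear the 30-inlier chunk threshold.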
Test Cases:
- Chunk alignment: Returns GPS within 20m of ground truth
- Featureless terrain: Succeeds where single-image fails
- Rotation >45°: Fails (requires pre-rotation via F06)
- Multi-scale: Handles GSD mismatch
match_chunk_homography(chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]
Description: Computes homography transformation from chunk to satellite.
Called By:
- Internal (during align_chunk_to_satellite)
Input:
chunk_images: List[np.ndarray]
satellite_tile: np.ndarray
Output:
Optional[np.ndarray]: 3×3 homography matrix or None
Algorithm (LiteSAM):
- Extract multi-scale features from all chunk images using TAIFormer
- Aggregate features (mean or max pooling)
- Compute correlation via Convolutional Token Mixer (CTM)
- Generate dense correspondences
- Estimate homography using RANSAC
- Refine with non-linear optimization
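The feature-aggregation step (mean or max pooling) can be sketched as below. The shapes and names are illustrative; the actual LiteSAM/TAIFormer feature layout may differ:

```python
import numpy as np

def aggregate_features(feature_maps, mode="mean"):
    """Pool per-image feature maps, each of assumed shape (C, H, W), into one
    chunk-level map by mean or max pooling across the chunk."""
    stack = np.stack(feature_maps, axis=0)  # (num_images, C, H, W)
    return stack.mean(axis=0) if mode == "mean" else stack.max(axis=0)
```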
Homography Properties:
- Maps pixels from the chunk's center image to the satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)
Test Cases:
- Valid correspondence: Returns 3×3 matrix
- Insufficient features: Returns None
- Aggregate correspondences: More robust than single-image
Integration Tests
Test 1: Single Tile Drift Correction
- Load UAV image and expected satellite tile
- Pre-rotate UAV image to known heading
- align_to_satellite() → returns GPS
- Verify GPS within 20m of ground truth
Test 2: Progressive Search (4 tiles)
- Load UAV image from sharp turn
- Get 2×2 tile grid from F04
- align_to_satellite() for each tile (with tile_bounds)
- First 3 tiles: No match
- 4th tile: Match found → GPS extracted
Test 3: Rotation Sensitivity
- Rotate UAV image by 60° (not pre-rotated)
- align_to_satellite() → returns None (fails as expected)
- Pre-rotate to 60°
- align_to_satellite() → succeeds
Test 4: Multi-Scale Robustness
- UAV at 500m altitude (GSD=0.1m/pixel)
- Satellite at zoom 19 (GSD=0.3m/pixel)
- LiteSAM handles scale difference → match succeeds
Test 5: Chunk LiteSAM Matching
- Build chunk with 10 images (plain field scenario)
- Pre-rotate chunk to known heading
- align_chunk_to_satellite() → returns GPS
- Verify GPS within 20m of ground truth
- Verify chunk matching more robust than single-image
Test 6: Chunk Rotation Sweeps
- Build chunk with unknown orientation
- Try chunk rotation steps (0°, 30°, ..., 330°)
- align_chunk_to_satellite() for each rotation
- Match found at 120° → GPS extracted
- Verify Sim(3) transform computed correctly
Non-Functional Requirements
Performance
- align_to_satellite: ~60ms per tile (TensorRT optimized)
- Progressive search 25 tiles: ~1.5 seconds total (25 × 60ms)
- Meets <5s per frame requirement
Accuracy
- GPS accuracy: 60% of frames < 20m error, 80% < 50m error
- Mean Reprojection Error (MRE): < 1.0 pixels
- Alignment success rate: > 95% when rotation correct
Reliability
- Graceful failure when no match
- Robust to altitude variations (<1km)
- Handles seasonal appearance changes (to the extent possible)
Dependencies
Internal Components
- F16 Model Manager: For LiteSAM model
- H01 Camera Model: For projection operations
- H02 GSD Calculator: For coordinate transformations
- H05 Performance Monitor: For timing
- F12 Route Chunk Manager: For chunk image retrieval
Note: tile_bounds is passed as parameter from caller (F02 Flight Processor gets it from F04 Satellite Data Manager)
External Dependencies
- LiteSAM: Cross-view matching model
- opencv-python: Homography estimation
- numpy: Matrix operations
Data Models
AlignmentResult
class AlignmentResult(BaseModel):
    matched: bool
    homography: np.ndarray  # (3, 3)
    gps_center: GPSPoint
    confidence: float
    inlier_count: int
    total_correspondences: int
    reprojection_error: float  # Mean error in pixels
GPSPoint
class GPSPoint(BaseModel):
    lat: float
    lon: float
TileBounds
class TileBounds(BaseModel):
    nw: GPSPoint
    ne: GPSPoint
    sw: GPSPoint
    se: GPSPoint
    center: GPSPoint
    gsd: float  # Ground Sampling Distance (m/pixel)
LiteSAMConfig
class LiteSAMConfig(BaseModel):
    model_path: str
    confidence_threshold: float = 0.7
    min_inliers: int = 15
    max_reprojection_error: float = 2.0  # pixels
    multi_scale_levels: int = 3
    chunk_min_inliers: int = 30  # Higher threshold for chunk matching
ChunkAlignmentResult
class ChunkAlignmentResult(BaseModel):
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform  # Translation, rotation, scale
    reprojection_error: float  # Mean error in pixels
Sim3Transform
class Sim3Transform(BaseModel):
    translation: np.ndarray  # (3,) translation vector
    rotation: np.ndarray  # (3, 3) rotation matrix or (4,) quaternion
    scale: float  # Scale factor