add chunking

This commit is contained in:
Oleksandr Bezdieniezhnykh
2025-11-27 03:43:19 +02:00
parent 4f8c18a066
commit 2037870f67
43 changed files with 7041 additions and 4135 deletions
# Metric Refinement
## Interface Definition
**Interface Name**: `IMetricRefinement`
### Interface Methods
```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

import numpy as np

# TileBounds, GPSPoint, AlignmentResult, and ChunkAlignmentResult are
# defined in the Data Models section below.

class IMetricRefinement(ABC):
    @abstractmethod
    def align_to_satellite(self, uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]:
        pass

    @abstractmethod
    def compute_homography(self, uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass

    @abstractmethod
    def extract_gps_from_alignment(self, homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint:
        pass

    @abstractmethod
    def compute_match_confidence(self, alignment: AlignmentResult) -> float:
        pass

    @abstractmethod
    def align_chunk_to_satellite(self, chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]:
        pass

    @abstractmethod
    def match_chunk_homography(self, chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass
```
## Component Description
### Responsibilities
- LiteSAM for precise UAV-to-satellite cross-view matching
- **Requires pre-rotated images** from Image Rotation Manager
- Compute homography mapping UAV image to satellite tile
- Extract absolute GPS coordinates from alignment
- Process against single tile (drift correction) or tile grid (progressive search)
- Achieve <20m accuracy requirement
- **Chunk-to-satellite matching (more robust than single-image)**
- **Chunk homography computation**
### Scope
- Cross-view geo-localization (UAV↔satellite)
- Handles altitude variations (<1km)
- Multi-scale processing for different GSDs
- Domain gap (UAV downward vs satellite nadir view)
- **Critical**: Fails if rotation >45° (handled by F06 Image Rotation Manager)
- **Chunk-level matching (aggregate correspondences from multiple images)**
## API Methods
### `align_to_satellite(uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]`
**Description**: Aligns UAV image to satellite tile, returning GPS location.
**Called By**:
- F06 Image Rotation Manager (during rotation sweep)
- F11 Failure Recovery Coordinator (progressive search)
- F02 Flight Processor (drift correction with single tile)
**Input**:
```python
uav_image: np.ndarray # Pre-rotated UAV image
satellite_tile: np.ndarray # Reference satellite tile
tile_bounds: TileBounds # GPS bounds and GSD of the satellite tile
```
**Output**:
```python
AlignmentResult:
matched: bool
homography: np.ndarray # 3×3 transformation matrix
gps_center: GPSPoint # UAV image center GPS
confidence: float
inlier_count: int
total_correspondences: int
```
**Processing Flow**:
1. Extract features from both images using LiteSAM encoder
2. Compute dense correspondence field
3. Estimate homography from correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
- Extract GPS from homography using tile_bounds
- Return AlignmentResult
6. If no match:
- Return None
**Match Criteria**:
- **Good match**: inlier_count > 30, confidence > 0.7
- **Weak match**: inlier_count 15-30, confidence 0.5-0.7
- **No match**: inlier_count < 15
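The tiers above can be sketched as a small classifier. The `classify_match` name and the handling of boundary cases (e.g. many inliers but middling confidence) are illustrative, not part of the interface:

```python
def classify_match(inlier_count: int, confidence: float) -> str:
    """Tier a match per the criteria above; boundary handling is an assumption."""
    if inlier_count > 30 and confidence > 0.7:
        return "good"
    if inlier_count >= 15 and confidence >= 0.5:
        return "weak"
    return "none"  # caller returns None and e.g. tries the next tile
```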
**Error Conditions**:
- Returns `None`: No match found, e.g. when residual rotation exceeds 45° (inputs should arrive pre-rotated from F06)
**Test Cases**:
1. **Good alignment**: Returns GPS within 20m of ground truth
2. **Altitude variation**: Handles GSD mismatch
3. **Rotation >45°**: Fails (by design, requires pre-rotation)
4. **Multi-scale**: Processes at multiple scales
---
### `compute_homography(uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]`
**Description**: Computes homography transformation from UAV to satellite.
**Called By**:
- Internal (during align_to_satellite)
**Input**:
```python
uav_image: np.ndarray
satellite_tile: np.ndarray
```
**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```
**Algorithm (LiteSAM)**:
1. Extract multi-scale features using TAIFormer
2. Compute correlation via Convolutional Token Mixer (CTM)
3. Generate dense correspondences
4. Estimate homography using RANSAC
5. Refine with non-linear optimization
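The estimation core of step 4 can be sketched with a plain Direct Linear Transform in numpy; a production version wraps this in the RANSAC loop and applies the non-linear refinement of step 5. The function name is illustrative:

```python
import numpy as np

def estimate_homography_dlt(src_pts, dst_pts):
    """Direct Linear Transform: find H such that dst ~ H @ src (homogeneous).

    Minimal sketch of the homography fit; no outlier rejection here.
    """
    assert len(src_pts) >= 4, "8 DoF need at least 4 correspondences"
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # u = (h1*x + h2*y + h3) / (h7*x + h8*y + h9), cross-multiplied
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)  # null vector for the smallest singular value
    return H / H[2, 2]        # fix scale: 8 degrees of freedom remain
```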
**Homography Properties**:
- Maps pixels from UAV image to satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)
**Error Conditions**:
- Returns `None`: Insufficient correspondences
**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None
---
### `extract_gps_from_alignment(homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint`
**Description**: Extracts GPS coordinates from homography and tile georeferencing.
**Called By**:
- Internal (during align_to_satellite)
- F06 Image Rotation Manager (for precise angle calculation)
**Input**:
```python
homography: np.ndarray # 3×3 matrix
tile_bounds: TileBounds # GPS bounds of satellite tile
image_center: Tuple[int, int] # Center pixel of UAV image
```
**Output**:
```python
GPSPoint:
lat: float
lon: float
```
**Algorithm**:
1. Apply homography to UAV image center point
2. Get pixel coordinates in satellite tile
3. Convert satellite pixel to GPS using tile_bounds and GSD
4. Return GPS coordinates
**Uses**: tile_bounds parameter, H02 GSD Calculator
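A minimal numpy sketch of the algorithm above, using plain dicts in place of `TileBounds` and simple linear interpolation across the tile bounds (a small-tile approximation; in practice the H02 GSD Calculator owns this conversion, and the tile size default is an assumption):

```python
import numpy as np

def extract_gps_from_alignment(H, tile_bounds, image_center, tile_size=(256, 256)):
    """Map the UAV image center through H, then interpolate GPS in the tile.

    tile_bounds: dict with 'nw', 'ne', 'sw' -> (lat, lon) tuples, standing in
    for the TileBounds model. Linear interpolation is a small-tile approximation.
    """
    cx, cy = image_center
    p = H @ np.array([cx, cy, 1.0])            # step 1: apply homography
    px, py = p[0] / p[2], p[1] / p[2]          # step 2: pixel in satellite tile
    w, h = tile_size
    nw_lat, nw_lon = tile_bounds["nw"]
    lat = nw_lat + (py / h) * (tile_bounds["sw"][0] - nw_lat)  # rows: north -> south
    lon = nw_lon + (px / w) * (tile_bounds["ne"][1] - nw_lon)  # cols: west -> east
    return lat, lon                            # step 4
```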
**Test Cases**:
1. **Center alignment**: UAV center → correct GPS
2. **Corner alignment**: UAV corner → correct GPS
3. **Multiple points**: All points consistent
---
### `compute_match_confidence(alignment: AlignmentResult) -> float`
**Description**: Computes match confidence score from alignment quality.
**Called By**:
- Internal (during align_to_satellite)
- F11 Failure Recovery Coordinator (to decide if match acceptable)
**Input**:
```python
alignment: AlignmentResult
```
**Output**:
```python
float: Confidence score (0.0 to 1.0)
```
**Confidence Factors**:
1. **Inlier ratio**: inliers / total_correspondences
2. **Inlier count**: Absolute number of inliers
3. **Reprojection error**: Mean error of inliers (in pixels)
4. **Spatial distribution**: Inliers well-distributed vs clustered
**Thresholds**:
- **High confidence (>0.8)**: inlier_ratio > 0.6, inlier_count > 50, MRE < 0.5px
- **Medium confidence (0.5-0.8)**: inlier_ratio > 0.4, inlier_count > 30
- **Low confidence (<0.5)**: Reject match
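One way to blend factors 1-3 into a 0-1 score. The weights and saturation points below are assumptions chosen to roughly reproduce the thresholds above, and the spatial-distribution factor is omitted for brevity:

```python
def compute_match_confidence(inliers, total, mean_reproj_err_px):
    """Blend inlier ratio, saturating inlier count, and reprojection error."""
    if total == 0:
        return 0.0
    ratio = inliers / total                          # factor 1: inlier ratio
    count = min(inliers / 50.0, 1.0)                 # factor 2: saturates at 50
    err = max(0.0, 1.0 - mean_reproj_err_px / 2.0)   # factor 3: zero at 2 px
    return 0.4 * ratio + 0.3 * count + 0.3 * err     # illustrative weights
```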
**Test Cases**:
1. **Good match**: confidence > 0.8
2. **Weak match**: confidence 0.5-0.7
3. **Poor match**: confidence < 0.5
---
### `align_chunk_to_satellite(chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]`
**Description**: Aligns entire chunk to satellite tile, returning GPS location.
**Called By**:
- F06 Image Rotation Manager (during chunk rotation sweep)
- F11 Failure Recovery Coordinator (chunk LiteSAM matching)
**Input**:
```python
chunk_images: List[np.ndarray] # Pre-rotated chunk images (5-20 images)
satellite_tile: np.ndarray # Reference satellite tile
tile_bounds: TileBounds # GPS bounds and GSD of the satellite tile
```
**Output**:
```python
ChunkAlignmentResult:
matched: bool
chunk_id: str
chunk_center_gps: GPSPoint # GPS of chunk center (middle frame)
rotation_angle: float
confidence: float
inlier_count: int
transform: Sim3Transform
```
**Processing Flow**:
1. For each image in chunk:
- Extract features using LiteSAM encoder
- Compute correspondences with satellite tile
2. Aggregate correspondences from all images
3. Estimate homography from aggregate correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
- Extract GPS from chunk center using tile_bounds
- Compute Sim(3) transform (translation, rotation, scale)
- Return ChunkAlignmentResult
6. If no match:
- Return None
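Steps 1-3 hinge on pooling correspondences from every image before a single homography fit; a minimal sketch with the per-image matching stubbed out (names and shapes are assumptions):

```python
import numpy as np

def aggregate_correspondences(per_image_matches):
    """Stack (uav_pts, sat_pts) pairs from all chunk images into one set.

    per_image_matches: list of (uav_pts, sat_pts) tuples, one per image,
    each array of shape (Ni, 2). Feeding the combined set to one homography
    estimate is why chunk matching tolerates images with few correspondences.
    """
    uav = np.vstack([m[0] for m in per_image_matches])
    sat = np.vstack([m[1] for m in per_image_matches])
    return uav, sat
```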
**Match Criteria**:
- **Good match**: inlier_count > 50, confidence > 0.7
- **Weak match**: inlier_count 30-50, confidence 0.5-0.7
- **No match**: inlier_count < 30
**Advantages over Single-Image Matching**:
- More correspondences (aggregate from multiple images)
- More robust to featureless terrain
- Better handles partial occlusions
- Higher confidence scores
**Test Cases**:
1. **Chunk alignment**: Returns GPS within 20m of ground truth
2. **Featureless terrain**: Succeeds where single-image fails
3. **Rotation >45°**: Fails (requires pre-rotation via F06)
4. **Multi-scale**: Handles GSD mismatch
---
### `match_chunk_homography(chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]`
**Description**: Computes homography transformation from chunk to satellite.
**Called By**:
- Internal (during align_chunk_to_satellite)
**Input**:
```python
chunk_images: List[np.ndarray]
satellite_tile: np.ndarray
```
**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```
**Algorithm (LiteSAM)**:
1. Extract multi-scale features from all chunk images using TAIFormer
2. Aggregate features (mean or max pooling)
3. Compute correlation via Convolutional Token Mixer (CTM)
4. Generate dense correspondences
5. Estimate homography using RANSAC
6. Refine with non-linear optimization
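Step 2's "mean or max pooling" over per-image feature maps, sketched in numpy under an assumed `(C, H, W)` feature-map layout:

```python
import numpy as np

def pool_chunk_features(features, mode="mean"):
    """Pool per-image feature maps (each C x H x W) into one chunk-level map."""
    stack = np.stack(features)  # (N, C, H, W)
    return stack.mean(axis=0) if mode == "mean" else stack.max(axis=0)
```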
**Homography Properties**:
- Maps pixels from chunk center to satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)
**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None
3. **Aggregate correspondences**: More robust than single-image
## Integration Tests
### Test 1: Single Tile Drift Correction
1. Load UAV image and expected satellite tile
2. Pre-rotate UAV image to known heading
3. align_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth
### Test 2: Progressive Search (4 tiles)
1. Load UAV image from sharp turn
2. Get 2×2 tile grid from F04
3. align_to_satellite() for each tile (with tile_bounds)
4. First 3 tiles: No match
5. 4th tile: Match found → GPS extracted
### Test 3: Rotation Sensitivity
1. Rotate UAV image by 60° (not pre-rotated)
2. align_to_satellite() → returns None (fails as expected)
3. Pre-rotate to 60°
4. align_to_satellite() → succeeds
### Test 4: Multi-Scale Robustness
1. UAV at 500m altitude (GSD=0.1m/pixel)
2. Satellite at zoom 19 (GSD=0.3m/pixel)
3. LiteSAM handles scale difference → match succeeds
### Test 5: Chunk LiteSAM Matching
1. Build chunk with 10 images (plain field scenario)
2. Pre-rotate chunk to known heading
3. align_chunk_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth
5. Verify chunk matching more robust than single-image
### Test 6: Chunk Rotation Sweeps
1. Build chunk with unknown orientation
2. Try chunk rotation steps (0°, 30°, ..., 330°)
3. align_chunk_to_satellite() for each rotation
4. Match found at 120° → GPS extracted
5. Verify Sim(3) transform computed correctly
## Non-Functional Requirements
### Performance
- **align_to_satellite**: ~60ms per tile (TensorRT optimized)
- **Progressive search 25 tiles**: ~1.5 seconds total (25 × 60ms)
- Meets <5s per frame requirement
### Accuracy
- **GPS accuracy**: 60% of frames < 20m error, 80% < 50m error
- **Mean Reprojection Error (MRE)**: < 1.0 pixels
- **Alignment success rate**: > 95% when rotation correct
### Reliability
- Graceful failure when no match
- Robust to altitude variations (<1km)
- Handles seasonal appearance changes (to extent possible)
## Dependencies
### Internal Components
- **F16 Model Manager**: For LiteSAM model
- **H01 Camera Model**: For projection operations
- **H02 GSD Calculator**: For coordinate transformations
- **H05 Performance Monitor**: For timing
- **F12 Route Chunk Manager**: For chunk image retrieval
**Note**: tile_bounds is passed as parameter from caller (F02 Flight Processor gets it from F04 Satellite Data Manager)
### External Dependencies
- **LiteSAM**: Cross-view matching model
- **opencv-python**: Homography estimation
- **numpy**: Matrix operations
## Data Models
### AlignmentResult
```python
from pydantic import BaseModel, ConfigDict
import numpy as np

class AlignmentResult(BaseModel):
    # np.ndarray is not a native pydantic type; arbitrary_types_allowed is
    # required (pydantic v2) here and in the other array-bearing models
    model_config = ConfigDict(arbitrary_types_allowed=True)

    matched: bool
    homography: np.ndarray # (3, 3)
    gps_center: GPSPoint
    confidence: float
    inlier_count: int
    total_correspondences: int
    reprojection_error: float # Mean error in pixels
```
### GPSPoint
```python
class GPSPoint(BaseModel):
lat: float
lon: float
```
### TileBounds
```python
class TileBounds(BaseModel):
nw: GPSPoint
ne: GPSPoint
sw: GPSPoint
se: GPSPoint
center: GPSPoint
gsd: float # Ground Sampling Distance (m/pixel)
```
### LiteSAMConfig
```python
class LiteSAMConfig(BaseModel):
model_path: str
confidence_threshold: float = 0.7
min_inliers: int = 15
max_reprojection_error: float = 2.0 # pixels
multi_scale_levels: int = 3
chunk_min_inliers: int = 30 # Higher threshold for chunk matching
```
### ChunkAlignmentResult
```python
class ChunkAlignmentResult(BaseModel):
matched: bool
chunk_id: str
chunk_center_gps: GPSPoint
rotation_angle: float
confidence: float
inlier_count: int
transform: Sim3Transform # Translation, rotation, scale
reprojection_error: float # Mean error in pixels
```
### Sim3Transform
```python
class Sim3Transform(BaseModel):
translation: np.ndarray # (3,) - translation vector
rotation: np.ndarray # (3, 3) rotation matrix or (4,) quaternion
scale: float # Scale factor
```
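For reference, applying a `Sim3Transform` to a 3D point follows p' = scale · R · p + t; a small numpy sketch consistent with the fields above, assuming the rotation is given as a 3×3 matrix:

```python
import numpy as np

def apply_sim3(translation, rotation, scale, point):
    """p' = scale * R @ p + t, matching the Sim3Transform fields."""
    return scale * rotation @ point + translation
```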