# Feature: Single Image Alignment

## Description

Core UAV-to-satellite cross-view matching for individual frames using LiteSAM. Computes precise GPS coordinates by aligning a pre-rotated UAV image to a georeferenced satellite tile through homography estimation.

## Component APIs Implemented

- `align_to_satellite(uav_image, satellite_tile, tile_bounds) -> Optional[AlignmentResult]`
- `compute_homography(uav_image, satellite_tile) -> Optional[np.ndarray]`
- `extract_gps_from_alignment(homography, tile_bounds, image_center) -> GPSPoint`
- `compute_match_confidence(alignment) -> float`

## External Tools and Services

- **LiteSAM**: Cross-view matching model (TAIFormer encoder, CTM correlation)
- **opencv-python**: RANSAC homography estimation, image operations
- **numpy**: Matrix operations, coordinate transformations

## Internal Methods

| Method | Purpose |
|--------|---------|
| `_extract_features(image)` | Extract multi-scale features using LiteSAM TAIFormer encoder |
| `_compute_correspondences(uav_features, sat_features)` | Compute dense correspondence field via CTM |
| `_estimate_homography_ransac(correspondences)` | Estimate 3×3 homography using RANSAC |
| `_refine_homography(homography, correspondences)` | Non-linear refinement of homography |
| `_validate_match(homography, inliers)` | Check inlier count/ratio thresholds |
| `_pixel_to_gps(pixel, tile_bounds)` | Convert satellite pixel coordinates to GPS |
| `_compute_inlier_ratio(inliers, total)` | Calculate inlier ratio for confidence |
| `_compute_spatial_distribution(inliers)` | Assess inlier spatial distribution quality |
| `_compute_reprojection_error(homography, correspondences)` | Calculate mean reprojection error |

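Two of the numeric helpers in the table reduce to a few lines of numpy. A minimal sketch of what `_compute_reprojection_error` and `_compute_inlier_ratio` could look like (array shapes and function names here are illustrative assumptions, not the actual implementation):

```python
import numpy as np


def compute_reprojection_error(H: np.ndarray, uav_pts: np.ndarray,
                               sat_pts: np.ndarray) -> float:
    """Mean pixel distance between H-projected UAV points and their
    satellite correspondences. uav_pts and sat_pts have shape (N, 2)."""
    ones = np.ones((uav_pts.shape[0], 1))
    proj = (H @ np.hstack([uav_pts, ones]).T).T   # project in homogeneous coords
    proj = proj[:, :2] / proj[:, 2:3]             # back to pixel coordinates
    return float(np.linalg.norm(proj - sat_pts, axis=1).mean())


def compute_inlier_ratio(inlier_count: int, total_correspondences: int) -> float:
    """Fraction of correspondences that survived RANSAC."""
    return inlier_count / total_correspondences if total_correspondences else 0.0
```

With an identity homography and identical point sets, the reprojection error is zero, which makes the helper easy to sanity-check in a unit test.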
## Unit Tests

1. **Feature extraction**: LiteSAM encoder produces valid feature tensors
2. **Correspondence computation**: CTM produces dense correspondence field
3. **Homography estimation**: RANSAC returns valid 3×3 matrix for good correspondences
4. **Homography estimation failure**: Returns None for insufficient correspondences (<15 inliers)
5. **GPS extraction accuracy**: Pixel-to-GPS conversion within expected tolerance
6. **Confidence high**: Returns >0.8 for inlier_ratio >0.6, inlier_count >50, MRE <0.5px
7. **Confidence medium**: Returns 0.5-0.8 for moderate match quality
8. **Confidence low**: Returns <0.5 for poor matches
9. **Reprojection error calculation**: Correctly computes mean pixel error
10. **Spatial distribution scoring**: Penalizes clustered inliers

## Integration Tests

1. **Single tile drift correction**: Load UAV image + satellite tile → align_to_satellite() returns GPS within 20m of ground truth
2. **Progressive search (4 tiles)**: align_to_satellite() on 2×2 grid, first 3 fail, 4th succeeds
3. **Rotation sensitivity**: Unrotated image (>45°) fails; pre-rotated image succeeds
4. **Multi-scale robustness**: Different GSD (UAV 0.1m/px, satellite 0.3m/px) → match succeeds
5. **Altitude variation**: UAV at various altitudes (<1km) → consistent GPS accuracy
6. **Performance benchmark**: align_to_satellite() completes in ~60ms (TensorRT)

# Feature: Chunk Alignment

## Description

Batch UAV-to-satellite matching that aggregates correspondences from multiple images in a chunk for more robust geo-localization. Handles scenarios where single-image matching fails (featureless terrain, partial occlusions). Returns a Sim(3) transform for the entire chunk.

## Component APIs Implemented

- `align_chunk_to_satellite(chunk_images, satellite_tile, tile_bounds) -> Optional[ChunkAlignmentResult]`
- `match_chunk_homography(chunk_images, satellite_tile) -> Optional[np.ndarray]`

## External Tools and Services

- **LiteSAM**: Cross-view matching model (TAIFormer encoder, CTM correlation)
- **opencv-python**: RANSAC homography estimation
- **numpy**: Matrix operations, feature aggregation

## Internal Methods

| Method | Purpose |
|--------|---------|
| `_extract_chunk_features(chunk_images)` | Extract features from all chunk images |
| `_aggregate_features(features_list)` | Combine features via mean/max pooling |
| `_aggregate_correspondences(correspondences_list)` | Merge correspondences from multiple images |
| `_estimate_chunk_homography(aggregated_correspondences)` | Estimate homography from aggregate data |
| `_compute_sim3_transform(homography, tile_bounds)` | Extract translation, rotation, scale |
| `_get_chunk_center_gps(homography, tile_bounds, chunk_images)` | GPS of middle frame center |
| `_validate_chunk_match(inliers, confidence)` | Check chunk-specific thresholds (>30 inliers) |

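`_aggregate_correspondences` is essentially a concatenation of per-image point pairs into one pool that downstream RANSAC treats as a single match set. A minimal sketch, assuming each per-image entry is a pair of `(N_i, 2)` point arrays (the exact correspondence format is an assumption):

```python
import numpy as np


def aggregate_correspondences(correspondences_list):
    """Merge per-image (uav_pts, sat_pts) pairs into one aggregate pool.

    Each list entry is a tuple of (N_i, 2) arrays; the result is a pair of
    (sum(N_i), 2) arrays covering the whole chunk.
    """
    uav = np.vstack([pair[0] for pair in correspondences_list])
    sat = np.vstack([pair[1] for pair in correspondences_list])
    return uav, sat
```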
## Unit Tests

1. **Feature aggregation**: Mean pooling produces valid combined features
2. **Correspondence aggregation**: Merges correspondences from N images correctly
3. **Chunk homography estimation**: Returns valid 3×3 matrix for aggregate correspondences
4. **Chunk homography failure**: Returns None for insufficient aggregate correspondences
5. **Sim(3) extraction**: Correctly decomposes homography into translation, rotation, scale
6. **Chunk center GPS**: Returns GPS of middle frame's center pixel
7. **Chunk confidence high**: Returns >0.7 for >50 inliers
8. **Chunk confidence medium**: Returns 0.5-0.7 for 30-50 inliers
9. **Chunk validation**: Rejects matches with <30 inliers

## Integration Tests

1. **Chunk LiteSAM matching**: 10 images from plain field → align_chunk_to_satellite() returns GPS within 20m
2. **Chunk vs single-image robustness**: Featureless terrain where single-image fails, chunk succeeds
3. **Chunk rotation sweeps**: Unknown orientation → try rotations (0°, 30°, ..., 330°) → match at correct angle
4. **Sim(3) transform correctness**: Verify transform aligns chunk trajectory to satellite coordinates
5. **Multi-scale chunk matching**: GSD mismatch handled correctly
6. **Performance benchmark**: 10-image chunk alignment completes within acceptable time
7. **Partial occlusion handling**: Some images occluded → chunk still matches successfully

# Metric Refinement

## Interface Definition

**Interface Name**: `IMetricRefinement`

### Interface Methods

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

import numpy as np


class IMetricRefinement(ABC):
    @abstractmethod
    def align_to_satellite(self, uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]:
        pass

    @abstractmethod
    def compute_homography(self, uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass

    @abstractmethod
    def extract_gps_from_alignment(self, homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint:
        pass

    @abstractmethod
    def compute_match_confidence(self, alignment: AlignmentResult) -> float:
        pass

    @abstractmethod
    def align_chunk_to_satellite(self, chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]:
        pass

    @abstractmethod
    def match_chunk_homography(self, chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]:
        pass
```

## Component Description

### Responsibilities
- Run LiteSAM for precise UAV-to-satellite cross-view matching
- **Requires pre-rotated images** from Image Rotation Manager
- Compute homography mapping UAV image to satellite tile
- Extract absolute GPS coordinates from alignment
- Process against single tile (drift correction) or tile grid (progressive search)
- Achieve <20m accuracy requirement
- **Chunk-to-satellite matching (more robust than single-image)**
- **Chunk homography computation**

### Scope
- Cross-view geo-localization (UAV↔satellite)
- Handles altitude variations (<1km)
- Multi-scale processing for different GSDs
- Domain gap (UAV downward vs satellite nadir view)
- **Critical**: Fails if rotation >45° (handled by F06)
- **Chunk-level matching (aggregate correspondences from multiple images)**

## API Methods

### `align_to_satellite(uav_image: np.ndarray, satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[AlignmentResult]`

**Description**: Aligns UAV image to satellite tile, returning GPS location.

**Called By**:
- F06 Image Rotation Manager (during rotation sweep)
- F11 Failure Recovery Coordinator (progressive search)
- F02.2 Flight Processing Engine (drift correction with single tile)

**Input**:
```python
uav_image: np.ndarray        # Pre-rotated UAV image
satellite_tile: np.ndarray   # Reference satellite tile
tile_bounds: TileBounds      # GPS bounds and GSD of the satellite tile
```

**Output**:
```python
AlignmentResult:
    matched: bool
    homography: np.ndarray        # 3×3 transformation matrix
    gps_center: GPSPoint          # UAV image center GPS
    confidence: float
    inlier_count: int
    total_correspondences: int
```

**Processing Flow**:
1. Extract features from both images using LiteSAM encoder
2. Compute dense correspondence field
3. Estimate homography from correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
   - Extract GPS from homography using tile_bounds
   - Return AlignmentResult
6. If no match:
   - Return None

**Match Criteria**:
- **Good match**: inlier_count > 30, confidence > 0.7
- **Weak match**: inlier_count 15-30, confidence 0.5-0.7
- **No match**: inlier_count < 15

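The criteria above translate directly into code. A hedged sketch of how a caller might bucket a result (the function name and string labels are illustrative, not part of the module's API):

```python
def classify_match(inlier_count: int, confidence: float) -> str:
    """Bucket an alignment result per the match criteria above."""
    if inlier_count < 15:
        return "no_match"
    if inlier_count > 30 and confidence > 0.7:
        return "good"
    return "weak"
```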
**Error Conditions**:
- Returns `None`: No match found, rotation >45° (should be pre-rotated)

**Test Cases**:
1. **Good alignment**: Returns GPS within 20m of ground truth
2. **Altitude variation**: Handles GSD mismatch
3. **Rotation >45°**: Fails (by design, requires pre-rotation)
4. **Multi-scale**: Processes at multiple scales

---

### `compute_homography(uav_image: np.ndarray, satellite_tile: np.ndarray) -> Optional[np.ndarray]`

**Description**: Computes homography transformation from UAV to satellite.

**Called By**:
- Internal (during align_to_satellite)

**Input**:
```python
uav_image: np.ndarray
satellite_tile: np.ndarray
```

**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```

**Algorithm (LiteSAM)**:
1. Extract multi-scale features using TAIFormer
2. Compute correlation via Convolutional Token Mixer (CTM)
3. Generate dense correspondences
4. Estimate homography using RANSAC
5. Refine with non-linear optimization

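In practice step 4 would use `cv2.findHomography` with the RANSAC flag; the estimation core that it wraps is the direct linear transform (DLT). A self-contained numpy sketch of that core, without the RANSAC loop or the non-linear refinement of step 5:

```python
import numpy as np


def estimate_homography_dlt(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares homography from >=4 point pairs (src, dst: (N, 2)).

    Each pair (x, y) -> (u, v) contributes two rows of the DLT system
    A h = 0; the solution is the right singular vector of A with the
    smallest singular value.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]          # normalize so H[2, 2] == 1
```

For exact correspondences this recovers the true homography up to numerical precision; a production version adds RANSAC around it to reject outliers.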
**Homography Properties**:
- Maps pixels from UAV image to satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)

**Error Conditions**:
- Returns `None`: Insufficient correspondences

**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None

---

### `extract_gps_from_alignment(homography: np.ndarray, tile_bounds: TileBounds, image_center: Tuple[int, int]) -> GPSPoint`

**Description**: Extracts GPS coordinates from homography and tile georeferencing.

**Called By**:
- Internal (during align_to_satellite)
- F06 Image Rotation Manager (for precise angle calculation)

**Input**:
```python
homography: np.ndarray          # 3×3 matrix
tile_bounds: TileBounds         # GPS bounds of satellite tile
image_center: Tuple[int, int]   # Center pixel of UAV image
```

**Output**:
```python
GPSPoint:
    lat: float
    lon: float
```

**Algorithm**:
1. Apply homography to UAV image center point
2. Get pixel coordinates in satellite tile
3. Convert satellite pixel to GPS using tile_bounds and GSD
4. Return GPS coordinates

**Uses**: tile_bounds parameter, H02 GSD Calculator

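Combining the steps: project the image center through H, then interpolate the tile's GPS bounds. A simplified sketch assuming an axis-aligned, north-up tile and linear lat/lon interpolation between corners (the corner tuples and `tile_size` parameter are assumptions standing in for the TileBounds model):

```python
import numpy as np


def pixel_to_gps(H: np.ndarray, image_center: tuple, nw: tuple, se: tuple,
                 tile_size: tuple) -> tuple:
    """Project UAV image center through H, then map the satellite pixel
    to (lat, lon). nw/se are (lat, lon) tile corners; tile_size is
    (width_px, height_px)."""
    cx, cy = image_center
    p = H @ np.array([cx, cy, 1.0])
    px, py = p[0] / p[2], p[1] / p[2]     # pixel position in satellite tile
    lat = nw[0] + (py / tile_size[1]) * (se[0] - nw[0])   # lat falls southward
    lon = nw[1] + (px / tile_size[0]) * (se[1] - nw[1])
    return lat, lon
```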
**Test Cases**:
1. **Center alignment**: UAV center → correct GPS
2. **Corner alignment**: UAV corner → correct GPS
3. **Multiple points**: All points consistent

---

### `compute_match_confidence(alignment: AlignmentResult) -> float`

**Description**: Computes match confidence score from alignment quality.

**Called By**:
- Internal (during align_to_satellite)
- F11 Failure Recovery Coordinator (to decide if match acceptable)

**Input**:
```python
alignment: AlignmentResult
```

**Output**:
```python
float: Confidence score (0.0 to 1.0)
```

**Confidence Factors**:
1. **Inlier ratio**: inliers / total_correspondences
2. **Inlier count**: Absolute number of inliers
3. **Reprojection error**: Mean error of inliers (in pixels)
4. **Spatial distribution**: Inliers well-distributed vs clustered

**Thresholds**:
- **High confidence (>0.8)**: inlier_ratio > 0.6, inlier_count > 50, MRE < 0.5px
- **Medium confidence (0.5-0.8)**: inlier_ratio > 0.4, inlier_count > 30
- **Low confidence (<0.5)**: Reject match

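One way the four factors could blend into a single score, as a hedged sketch: the spec only fixes the threshold bands, so the weights below are purely illustrative assumptions:

```python
def match_confidence(inlier_ratio: float, inlier_count: int,
                     mean_reproj_error: float, spatial_score: float) -> float:
    """Blend the four confidence factors into [0, 1].

    Weights and saturation points are assumptions; only the output bands
    (>0.8 high, 0.5-0.8 medium, <0.5 reject) come from the spec.
    """
    count_term = min(inlier_count / 50.0, 1.0)             # saturates at 50 inliers
    error_term = max(0.0, 1.0 - mean_reproj_error / 2.0)   # 0 at 2px MRE
    score = (0.35 * inlier_ratio + 0.25 * count_term
             + 0.25 * error_term + 0.15 * spatial_score)
    return max(0.0, min(1.0, score))
```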
**Test Cases**:
1. **Good match**: confidence > 0.8
2. **Weak match**: confidence 0.5-0.7
3. **Poor match**: confidence < 0.5

---

### `align_chunk_to_satellite(chunk_images: List[np.ndarray], satellite_tile: np.ndarray, tile_bounds: TileBounds) -> Optional[ChunkAlignmentResult]`

**Description**: Aligns entire chunk to satellite tile, returning GPS location.

**Called By**:
- F06 Image Rotation Manager (during chunk rotation sweep)
- F11 Failure Recovery Coordinator (chunk LiteSAM matching)

**Input**:
```python
chunk_images: List[np.ndarray]   # Pre-rotated chunk images (5-20 images)
satellite_tile: np.ndarray       # Reference satellite tile
tile_bounds: TileBounds          # GPS bounds and GSD of the satellite tile
```

**Output**:
```python
ChunkAlignmentResult:
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint   # GPS of chunk center (middle frame)
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform
```

**Processing Flow**:
1. For each image in chunk:
   - Extract features using LiteSAM encoder
   - Compute correspondences with satellite tile
2. Aggregate correspondences from all images
3. Estimate homography from aggregate correspondences
4. Validate match quality (inlier count, reprojection error)
5. If valid match:
   - Extract GPS from chunk center using tile_bounds
   - Compute Sim(3) transform (translation, rotation, scale)
   - Return ChunkAlignmentResult
6. If no match:
   - Return None

**Match Criteria**:
- **Good match**: inlier_count > 50, confidence > 0.7
- **Weak match**: inlier_count 30-50, confidence 0.5-0.7
- **No match**: inlier_count < 30

**Advantages over Single-Image Matching**:
- More correspondences (aggregate from multiple images)
- More robust to featureless terrain
- Better handles partial occlusions
- Higher confidence scores

**Test Cases**:
1. **Chunk alignment**: Returns GPS within 20m of ground truth
2. **Featureless terrain**: Succeeds where single-image fails
3. **Rotation >45°**: Fails (requires pre-rotation via F06)
4. **Multi-scale**: Handles GSD mismatch

---

### `match_chunk_homography(chunk_images: List[np.ndarray], satellite_tile: np.ndarray) -> Optional[np.ndarray]`

**Description**: Computes homography transformation from chunk to satellite.

**Called By**:
- Internal (during align_chunk_to_satellite)

**Input**:
```python
chunk_images: List[np.ndarray]
satellite_tile: np.ndarray
```

**Output**:
```python
Optional[np.ndarray]: 3×3 homography matrix or None
```

**Algorithm (LiteSAM)**:
1. Extract multi-scale features from all chunk images using TAIFormer
2. Aggregate features (mean or max pooling)
3. Compute correlation via Convolutional Token Mixer (CTM)
4. Generate dense correspondences
5. Estimate homography using RANSAC
6. Refine with non-linear optimization

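Step 2's aggregation is a pooling over per-image feature maps. A minimal sketch covering both variants (the `(C, H, W)` feature shape is an assumption about the encoder output):

```python
import numpy as np


def aggregate_features(features_list, mode: str = "mean") -> np.ndarray:
    """Pool a list of identically shaped (C, H, W) feature maps into one."""
    stack = np.stack(features_list)    # (N, C, H, W)
    if mode == "mean":
        return stack.mean(axis=0)
    if mode == "max":
        return stack.max(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")
```

Mean pooling smooths per-image noise; max pooling keeps the strongest activation from any frame, which can help when only some frames see a distinctive landmark.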
**Homography Properties**:
- Maps pixels from chunk center to satellite image
- Accounts for: scale, rotation, perspective
- 8 DoF (degrees of freedom)

**Test Cases**:
1. **Valid correspondence**: Returns 3×3 matrix
2. **Insufficient features**: Returns None
3. **Aggregate correspondences**: More robust than single-image

## Integration Tests

### Test 1: Single Tile Drift Correction
1. Load UAV image and expected satellite tile
2. Pre-rotate UAV image to known heading
3. align_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth

### Test 2: Progressive Search (4 tiles)
1. Load UAV image from sharp turn
2. Get 2×2 tile grid from F04
3. align_to_satellite() for each tile (with tile_bounds)
4. First 3 tiles: No match
5. 4th tile: Match found → GPS extracted

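The progressive-search pattern this test exercises is a short loop over candidate tiles, stopping at the first hit. A sketch with the aligner injected as a callable so the pattern can be exercised without LiteSAM (`align_fn` stands in for `align_to_satellite` and returns None on no-match):

```python
def progressive_search(tiles, align_fn):
    """Try each (tile, bounds) pair in turn; return the first successful
    alignment result, or None if no tile matches."""
    for tile, bounds in tiles:
        result = align_fn(tile, bounds)
        if result is not None:
            return result
    return None
```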
### Test 3: Rotation Sensitivity
1. Rotate UAV image by 60° (not pre-rotated)
2. align_to_satellite() → returns None (fails as expected)
3. Pre-rotate to 60°
4. align_to_satellite() → succeeds

### Test 4: Multi-Scale Robustness
1. UAV at 500m altitude (GSD=0.1m/pixel)
2. Satellite at zoom 19 (GSD=0.3m/pixel)
3. LiteSAM handles scale difference → match succeeds

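The scale gap in this test is fixed by the GSD ratio: the resize factor that brings the UAV image onto the satellite tile's scale is a one-liner (the actual resampling would use something like `cv2.resize`; this helper name is illustrative):

```python
def gsd_scale_factor(uav_gsd: float, sat_gsd: float) -> float:
    """Resize factor for the UAV image so its pixels match the satellite GSD.

    E.g. UAV at 0.1 m/px vs satellite at 0.3 m/px -> shrink to 1/3 size.
    """
    return uav_gsd / sat_gsd
```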
### Test 5: Chunk LiteSAM Matching
1. Build chunk with 10 images (plain field scenario)
2. Pre-rotate chunk to known heading
3. align_chunk_to_satellite() → returns GPS
4. Verify GPS within 20m of ground truth
5. Verify chunk matching more robust than single-image

### Test 6: Chunk Rotation Sweeps
1. Build chunk with unknown orientation
2. Try chunk rotation steps (0°, 30°, ..., 330°)
3. align_chunk_to_satellite() for each rotation
4. Match found at 120° → GPS extracted
5. Verify Sim(3) transform computed correctly

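The sweep in steps 2-4 is a loop over candidate headings that keeps the best-confidence hit. A sketch with the chunk aligner injected (rotating the images themselves is F06's job; here `align_fn(angle)` stands in for rotating the chunk and calling `align_chunk_to_satellite`, returning a confidence float or None):

```python
def rotation_sweep(align_fn, step_deg: int = 30):
    """Try candidate headings 0°, 30°, ..., 330°; return (angle, confidence)
    for the highest-confidence match, or (None, None) if nothing matches."""
    best_angle, best_conf = None, None
    for angle in range(0, 360, step_deg):
        conf = align_fn(angle)
        if conf is not None and (best_conf is None or conf > best_conf):
            best_angle, best_conf = angle, conf
    return best_angle, best_conf
```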
## Non-Functional Requirements

### Performance
- **align_to_satellite**: ~60ms per tile (TensorRT optimized)
- **Progressive search 25 tiles**: ~1.5 seconds total (25 × 60ms)
- Meets <5s per frame requirement

### Accuracy
- **GPS accuracy**: 60% of frames < 20m error, 80% < 50m error
- **Mean Reprojection Error (MRE)**: < 1.0 pixels
- **Alignment success rate**: > 95% when rotation correct

### Reliability
- Graceful failure when no match
- Robust to altitude variations (<1km)
- Handles seasonal appearance changes (to extent possible)

## Dependencies

### Internal Components
- **F12 Route Chunk Manager**: For chunk image retrieval and chunk operations
- **F16 Model Manager**: For LiteSAM model
- **H01 Camera Model**: For projection operations
- **H02 GSD Calculator**: For coordinate transformations
- **H05 Performance Monitor**: For timing

**Critical Dependency on F06 Image Rotation Manager**:
- F09 requires pre-rotated images (rotation <45° from north)
- Caller (F06 or F11) must pre-rotate images using F06.rotate_image_360() before calling F09.align_to_satellite()
- If rotation >45°, F09 will fail to match (by design)
- F06 handles the rotation sweep (trying 0°, 30°, 60°, etc.) and calls F09 for each rotation

**Note**: tile_bounds is passed as parameter from caller (F02.2 Flight Processing Engine gets it from F04 Satellite Data Manager)

### External Dependencies
- **LiteSAM**: Cross-view matching model
- **opencv-python**: Homography estimation
- **numpy**: Matrix operations

## Data Models

### AlignmentResult
```python
class AlignmentResult(BaseModel):
    matched: bool
    homography: np.ndarray       # (3, 3)
    gps_center: GPSPoint
    confidence: float
    inlier_count: int
    total_correspondences: int
    reprojection_error: float    # Mean error in pixels
```

### GPSPoint
```python
class GPSPoint(BaseModel):
    lat: float
    lon: float
```

### TileBounds
```python
class TileBounds(BaseModel):
    nw: GPSPoint
    ne: GPSPoint
    sw: GPSPoint
    se: GPSPoint
    center: GPSPoint
    gsd: float    # Ground Sampling Distance (m/pixel)
```

### LiteSAMConfig
```python
class LiteSAMConfig(BaseModel):
    model_path: str
    confidence_threshold: float = 0.7
    min_inliers: int = 15
    max_reprojection_error: float = 2.0   # pixels
    multi_scale_levels: int = 3
    chunk_min_inliers: int = 30           # Higher threshold for chunk matching
```

### ChunkAlignmentResult
```python
class ChunkAlignmentResult(BaseModel):
    matched: bool
    chunk_id: str
    chunk_center_gps: GPSPoint
    rotation_angle: float
    confidence: float
    inlier_count: int
    transform: Sim3Transform     # Translation, rotation, scale
    reprojection_error: float    # Mean error in pixels
```

### Sim3Transform
```python
class Sim3Transform(BaseModel):
    translation: np.ndarray   # (3,) translation vector
    rotation: np.ndarray      # (3, 3) rotation matrix or (4,) quaternion
    scale: float              # Scale factor
```

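For the planar, nadir-looking case, a homography that is (close to) a 2-D similarity `[sR | t]` can be decomposed directly into the fields above. A sketch of reading off scale, rotation angle, and translation, assuming the matrix really is a similarity, which a full Sim(3) fit over the chunk trajectory would not assume:

```python
import math

import numpy as np


def decompose_similarity(H: np.ndarray):
    """Split a 2-D similarity homography into (scale, angle_rad, translation).

    Assumes H = [[s*cos, -s*sin, tx],
                 [s*sin,  s*cos, ty],
                 [0,      0,     1 ]].
    """
    a, b = H[0, 0], H[1, 0]
    scale = math.hypot(a, b)            # s = sqrt(a^2 + b^2)
    angle = math.atan2(b, a)            # rotation angle in radians
    translation = H[:2, 2].copy()
    return scale, angle, translation
```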