component decomposition is done

This commit is contained in:
Oleksandr Bezdieniezhnykh
2025-11-24 14:09:23 +02:00
parent acec83018b
commit f50006d100
34 changed files with 8637 additions and 0 deletions
# Sequential Visual Odometry
## Interface Definition
**Interface Name**: `ISequentialVO`
### Interface Methods
```python
from abc import ABC, abstractmethod
from typing import Optional

import numpy as np

# Features, Matches, Motion, RelativePose and CameraParameters are the data
# models defined later in this document.

class ISequentialVO(ABC):
    @abstractmethod
    def compute_relative_pose(self, prev_image: np.ndarray, curr_image: np.ndarray) -> Optional[RelativePose]:
        pass

    @abstractmethod
    def extract_features(self, image: np.ndarray) -> Features:
        pass

    @abstractmethod
    def match_features(self, features1: Features, features2: Features) -> Matches:
        pass

    @abstractmethod
    def estimate_motion(self, matches: Matches, camera_params: CameraParameters) -> Optional[Motion]:
        pass
```
## Component Description
### Responsibilities
- Extract SuperPoint features from UAV images
- Match features between consecutive frames with LightGlue
- Handle <5% overlap scenarios
- Estimate relative pose (translation + rotation) between frames
- Return relative pose factors to the Factor Graph Optimizer
- Detect tracking loss (low inlier count)
### Scope
- Frame-to-frame visual odometry
- Feature-based motion estimation
- Handles low overlap and challenging agricultural environments
- Provides relative measurements for trajectory optimization
## API Methods
### `compute_relative_pose(prev_image: np.ndarray, curr_image: np.ndarray) -> Optional[RelativePose]`
**Description**: Computes relative camera pose between consecutive frames.
**Called By**:
- Main processing loop (per-frame)
**Input**:
```python
prev_image: np.ndarray # Previous frame (t-1)
curr_image: np.ndarray # Current frame (t)
```
**Output**:
```python
RelativePose:
translation: np.ndarray # (x, y, z) in meters
rotation: np.ndarray # 3×3 rotation matrix or quaternion
confidence: float # 0.0 to 1.0
inlier_count: int
total_matches: int
tracking_good: bool
```
**Processing Flow**:
1. `extract_features(prev_image)` → `features1`
2. `extract_features(curr_image)` → `features2`
3. `match_features(features1, features2)` → `matches`
4. `estimate_motion(matches, camera_params)` → `motion`
5. Assemble and return `RelativePose`
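The flow above can be sketched as a plain orchestration function with the four stages injected as callables. This is an illustration only; the function name and the dict return value stand in for the component's actual `RelativePose` assembly:

```python
from typing import Callable, Optional

def run_vo_pipeline(prev_image, curr_image,
                    extract: Callable, match: Callable,
                    estimate: Callable) -> Optional[dict]:
    """Orchestrate the five steps above; error handling simplified."""
    features1 = extract(prev_image)          # step 1
    features2 = extract(curr_image)          # step 2
    matches = match(features1, features2)    # step 3
    motion = estimate(matches)               # step 4
    if motion is None:
        return None                          # tracking lost
    # Step 5: the real component assembles a RelativePose here.
    return {"motion": motion, "total_matches": len(matches)}
```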
**Tracking Quality Indicators**:
- **Good tracking**: inlier_count > 50, inlier_ratio > 0.5
- **Degraded tracking**: inlier_count 20-50
- **Tracking loss**: inlier_count < 20
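These thresholds can be captured in a small helper; the function name and labels below are hypothetical, the cut-offs follow the indicators above:

```python
def classify_tracking(inlier_count: int, total_matches: int) -> str:
    """Map inlier statistics to a tracking-quality label (illustrative)."""
    inlier_ratio = inlier_count / total_matches if total_matches else 0.0
    if inlier_count > 50 and inlier_ratio > 0.5:
        return "good"
    if inlier_count >= 20:
        return "degraded"
    return "lost"
```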
**Error Conditions**:
- Returns `None`: Tracking lost (insufficient matches)
**Test Cases**:
1. **Good overlap (>50%)**: Returns reliable pose
2. **Low overlap (5-10%)**: Still succeeds with LightGlue
3. **<5% overlap**: May return None (tracking loss)
4. **Agricultural texture**: Handles repetitive patterns
---
### `extract_features(image: np.ndarray) -> Features`
**Description**: Extracts SuperPoint keypoints and descriptors from image.
**Called By**:
- Internal (during compute_relative_pose)
- G08 Global Place Recognition (for descriptor caching)
**Input**:
```python
image: np.ndarray # Input image (H×W×3 or H×W)
```
**Output**:
```python
Features:
keypoints: np.ndarray # (N, 2) - (x, y) coordinates
descriptors: np.ndarray # (N, 256) - 256-dim descriptors
scores: np.ndarray # (N,) - detection confidence scores
```
**Processing Details**:
- Uses G15 Model Manager to get SuperPoint model
- Converts to grayscale if needed
- Non-maximum suppression for keypoint selection
- Typically extracts 500-2000 keypoints per image
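The keypoint-selection step can be illustrated with a greedy non-maximum suppression over detection scores. This is a sketch only; SuperPoint's actual NMS operates on its dense score map, not on a keypoint list:

```python
import numpy as np

def nms_keypoints(keypoints: np.ndarray, scores: np.ndarray,
                  radius: float = 4.0, max_keypoints: int = 2000):
    """Greedy NMS: keep the strongest keypoint, drop neighbours
    within `radius` pixels, repeat until max_keypoints are kept."""
    order = np.argsort(-scores)  # strongest first
    kept = []
    for i in order:
        if all(np.linalg.norm(keypoints[i] - keypoints[j]) >= radius
               for j in kept):
            kept.append(i)
        if len(kept) == max_keypoints:
            break
    kept = np.array(kept, dtype=int)
    return keypoints[kept], scores[kept]
```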
**Performance**:
- Inference time: ~15ms with TensorRT on RTX 2060
**Error Conditions**:
- Never fails (returns empty features if image invalid)
**Test Cases**:
1. **FullHD image**: Extracts ~1000 keypoints
2. **High-res image (6252×4168)**: Extracts ~2000 keypoints
3. **Low-texture image**: Extracts fewer keypoints
---
### `match_features(features1: Features, features2: Features) -> Matches`
**Description**: Matches features using LightGlue attention-based matcher.
**Called By**:
- Internal (during compute_relative_pose)
**Input**:
```python
features1: Features # Previous frame features
features2: Features # Current frame features
```
**Output**:
```python
Matches:
matches: np.ndarray # (M, 2) - indices [idx1, idx2]
scores: np.ndarray # (M,) - match confidence scores
keypoints1: np.ndarray # (M, 2) - matched keypoints from frame 1
keypoints2: np.ndarray # (M, 2) - matched keypoints from frame 2
```
**Processing Details**:
- Uses G15 Model Manager to get LightGlue model
- Transformer-based attention mechanism
- "Dustbin" mechanism for unmatched features
- Adaptive depth (exits early for easy matches)
- **Critical**: Retains valid correspondences at <5% overlap, where classical nearest-neighbour matching leaves too few inliers for downstream RANSAC
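For contrast with the learned matcher, the classical baseline it replaces can be sketched as mutual nearest-neighbour matching on (unit-normalized) descriptors. LightGlue substitutes this with transformer attention plus a dustbin; the code below is the baseline only:

```python
import numpy as np

def mutual_nn_match(desc1: np.ndarray, desc2: np.ndarray,
                    min_score: float = 0.0):
    """Mutual nearest-neighbour matching on unit-normalized descriptors."""
    sim = desc1 @ desc2.T        # (N1, N2) cosine similarities
    nn12 = sim.argmax(axis=1)    # best partner in image 2 for each kp in image 1
    nn21 = sim.argmax(axis=0)    # best partner in image 1 for each kp in image 2
    idx1 = np.arange(desc1.shape[0])
    mutual = nn21[nn12] == idx1  # keep only pairs that agree both ways
    scores = sim[idx1, nn12]
    keep = mutual & (scores >= min_score)
    matches = np.stack([idx1[keep], nn12[keep]], axis=1)
    return matches, scores[keep]
```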
**Performance**:
- Inference time: ~35-100ms (adaptive depth)
- Faster for high-overlap, slower for low-overlap
**Test Cases**:
1. **High overlap**: Fast matching (~35ms), 500+ matches
2. **Low overlap (<5%)**: Slower (~100ms), 20-50 matches
3. **No overlap**: Few or no matches (< 10)
---
### `estimate_motion(matches: Matches, camera_params: CameraParameters) -> Optional[Motion]`
**Description**: Estimates camera motion from matched keypoints using Essential Matrix.
**Called By**:
- Internal (during compute_relative_pose)
**Input**:
```python
matches: Matches
camera_params: CameraParameters

CameraParameters:
    focal_length: float                   # in pixels
    principal_point: Tuple[float, float]  # (cx, cy) in pixels
    resolution: Tuple[int, int]           # (width, height)
```
**Output**:
```python
Motion:
translation: np.ndarray # (x, y, z) - unit vector (scale ambiguous)
rotation: np.ndarray # 3×3 rotation matrix
inliers: np.ndarray # Boolean mask of inlier matches
inlier_count: int
```
**Algorithm**:
1. Normalize keypoint coordinates using camera intrinsics
2. Estimate Essential Matrix using RANSAC
3. Decompose Essential Matrix → [R, t]
4. Return motion with inlier mask
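Step 1 can be sketched directly (assuming square pixels and no lens distortion); steps 2-3 correspond to OpenCV's `cv2.findEssentialMat` and `cv2.recoverPose`:

```python
import numpy as np

def normalize_keypoints(kpts: np.ndarray, focal_length: float,
                        principal_point: tuple) -> np.ndarray:
    """Map pixel coordinates (u, v) to normalized image coordinates:
    x_n = (u - cx) / f,  y_n = (v - cy) / f."""
    cx, cy = principal_point
    return (kpts - np.array([cx, cy])) / focal_length
```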
**Scale Ambiguity**:
- Monocular VO has inherent scale ambiguity
- Translation is unit vector (direction only)
- Scale resolved by:
- Altitude prior (from G10 Factor Graph)
- Absolute GPS measurements (from G09 LiteSAM)
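The GPS route to scale resolution can be sketched as follows (hypothetical helper; the altitude-prior route depends on the G10 factor graph formulation and is not shown):

```python
import numpy as np

def resolve_scale_from_gps(t_unit: np.ndarray,
                           gps_delta: np.ndarray) -> np.ndarray:
    """Recover a metric translation by scaling the unit direction with the
    magnitude of the absolute GPS displacement between the two frames."""
    return t_unit * np.linalg.norm(gps_delta)
```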
**Error Conditions**:
- Returns `None`: Insufficient inliers (below the Essential Matrix solver's minimum: 5 correspondences for the five-point algorithm, 8 for the classical eight-point algorithm)
**Test Cases**:
1. **Good matches**: Returns motion with high inlier count
2. **Low inliers**: May return None
3. **Degenerate motion**: Pure rotation (near-zero translation) is a degenerate case for Essential Matrix decomposition; verify it is detected rather than producing a spurious translation direction
## Integration Tests
### Test 1: Normal Flight Sequence
1. Load consecutive frames with 50% overlap
2. compute_relative_pose() → returns valid pose
3. Verify translation direction reasonable
4. Verify inlier_count > 100
### Test 2: Low Overlap Scenario
1. Load frames with 5% overlap
2. compute_relative_pose() → still succeeds
3. Verify inlier_count > 20
4. Verify LightGlue finds matches despite low overlap
### Test 3: Tracking Loss
1. Load frames with 0% overlap (sharp turn)
2. compute_relative_pose() → returns None
3. Verify tracking_good = False
4. Trigger global place recognition
### Test 4: Agricultural Texture
1. Load images of wheat fields (repetitive texture)
2. compute_relative_pose() → succeeds; SuperPoint handles repetitive texture better than SIFT
3. Verify match quality
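Test 3 can be written pytest-style against any `ISequentialVO` implementation. The stub below stands in for a real implementation purely to make the shape of the test concrete; all behaviour and names here are hypothetical:

```python
import numpy as np

class StubVO:
    """Stand-in for an ISequentialVO implementation (fake behaviour)."""
    def compute_relative_pose(self, prev_image, curr_image):
        # Pretend tracking fails when the frames share no pixels at all.
        overlap = float(np.mean(prev_image == curr_image))
        return None if overlap < 0.01 else {"tracking_good": True}

def test_tracking_loss():
    vo = StubVO()
    prev = np.zeros((64, 64), dtype=np.uint8)
    curr = np.full((64, 64), 255, dtype=np.uint8)  # 0% overlap (sharp turn)
    pose = vo.compute_relative_pose(prev, curr)
    assert pose is None  # tracking lost -> trigger global place recognition
```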
## Non-Functional Requirements
### Performance
- **compute_relative_pose**: < 200ms total
- SuperPoint extraction: ~15ms × 2 = 30ms
- LightGlue matching: ~50ms
- Motion estimation: ~10ms
- **Frame rate**: 5-10 FPS processing (meets <5s requirement)
### Accuracy
- **Relative rotation**: ±2° error
- **Relative translation direction**: ±5° error
- **Inlier ratio**: >50% for good tracking
### Reliability
- Handle 100m spacing between frames
- Survive temporary tracking degradation
- Recover from brief occlusions
## Dependencies
### Internal Components
- **G15 Model Manager**: For SuperPoint and LightGlue models
- **G16 Configuration Manager**: For camera parameters
- **H01 Camera Model**: For coordinate normalization
- **H05 Performance Monitor**: For timing measurements
### External Dependencies
- **SuperPoint**: Feature extraction model
- **LightGlue**: Feature matching model
- **opencv-python**: Essential Matrix estimation
- **numpy**: Matrix operations
## Data Models
### Features
```python
class Features(BaseModel):
    # np.ndarray fields require the model to allow arbitrary types
    # (e.g. Pydantic's arbitrary_types_allowed)
keypoints: np.ndarray # (N, 2)
descriptors: np.ndarray # (N, 256)
scores: np.ndarray # (N,)
```
### Matches
```python
class Matches(BaseModel):
matches: np.ndarray # (M, 2) - pairs of indices
scores: np.ndarray # (M,) - match confidence
keypoints1: np.ndarray # (M, 2)
keypoints2: np.ndarray # (M, 2)
```
### RelativePose
```python
class RelativePose(BaseModel):
translation: np.ndarray # (3,) - unit vector
rotation: np.ndarray # (3, 3) or (4,) quaternion
confidence: float
inlier_count: int
total_matches: int
tracking_good: bool
scale_ambiguous: bool = True
```
### Motion
```python
class Motion(BaseModel):
translation: np.ndarray # (3,)
rotation: np.ndarray # (3, 3)
inliers: np.ndarray # Boolean mask
inlier_count: int
```