mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-04-22 22:26:38 +00:00
# Sequential Visual Odometry

## Interface Definition

**Interface Name**: `ISequentialVO`

### Interface Methods

```python
from abc import ABC, abstractmethod
from typing import Optional

import numpy as np

# Features, Matches, RelativePose, Motion, and CameraParameters are
# defined in the Data Models section below.


class ISequentialVO(ABC):
    @abstractmethod
    def compute_relative_pose(self, prev_image: np.ndarray, curr_image: np.ndarray) -> Optional[RelativePose]:
        pass

    @abstractmethod
    def extract_features(self, image: np.ndarray) -> Features:
        pass

    @abstractmethod
    def match_features(self, features1: Features, features2: Features) -> Matches:
        pass

    @abstractmethod
    def estimate_motion(self, matches: Matches, camera_params: CameraParameters) -> Optional[Motion]:
        pass
```

## Component Description

### Responsibilities
- SuperPoint feature extraction from UAV images
- LightGlue feature matching between consecutive frames
- Handle <5% overlap scenarios
- Estimate relative pose (translation + rotation) between frames
- Return relative pose factors for Factor Graph Optimizer
- Detect tracking loss (low inlier count)

### Scope
- Frame-to-frame visual odometry
- Feature-based motion estimation
- Handles low overlap and challenging agricultural environments
- Provides relative measurements for trajectory optimization

## API Methods

### `compute_relative_pose(prev_image: np.ndarray, curr_image: np.ndarray) -> Optional[RelativePose]`

**Description**: Computes relative camera pose between consecutive frames.

**Called By**:
- Main processing loop (per-frame)

**Input**:
```python
prev_image: np.ndarray  # Previous frame (t-1)
curr_image: np.ndarray  # Current frame (t)
```

**Output**:
```python
RelativePose:
    translation: np.ndarray  # (x, y, z) unit direction (scale ambiguous, see estimate_motion)
    rotation: np.ndarray     # 3×3 rotation matrix or quaternion
    confidence: float        # 0.0 to 1.0
    inlier_count: int
    total_matches: int
    tracking_good: bool
```

**Processing Flow**:
1. extract_features(prev_image) → features1
2. extract_features(curr_image) → features2
3. match_features(features1, features2) → matches
4. estimate_motion(matches, camera_params) → motion
5. Return RelativePose
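The five steps above amount to a thin orchestration layer. A minimal sketch, with the three stages injected as callables so the flow can be exercised with stubs before the learned models are wired in (the callable-injection style and the stub return shapes are illustrative assumptions, not the real SuperPoint/LightGlue calls):

```python
from dataclasses import dataclass
from typing import Callable, Optional

import numpy as np


@dataclass
class RelativePose:
    translation: np.ndarray
    rotation: np.ndarray
    confidence: float
    inlier_count: int
    total_matches: int
    tracking_good: bool


def compute_relative_pose(prev_image: np.ndarray,
                          curr_image: np.ndarray,
                          extract: Callable,
                          match: Callable,
                          estimate: Callable) -> Optional[RelativePose]:
    features1 = extract(prev_image)        # step 1
    features2 = extract(curr_image)        # step 2
    matches = match(features1, features2)  # step 3
    motion = estimate(matches)             # step 4: (t, R, inlier_count) or None
    if motion is None:
        return None                        # tracking lost
    t, rot, inlier_count = motion
    total = len(matches)
    ratio = inlier_count / total if total else 0.0
    return RelativePose(translation=t, rotation=rot,
                        confidence=ratio,
                        inlier_count=inlier_count,
                        total_matches=total,
                        tracking_good=inlier_count > 50 and ratio > 0.5)
```

The `tracking_good` predicate mirrors the quality thresholds listed below.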

**Tracking Quality Indicators**:
- **Good tracking**: inlier_count > 50, inlier_ratio > 0.5
- **Degraded tracking**: inlier_count 20-50
- **Tracking loss**: inlier_count < 20
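These thresholds can be written as a small classifier (a hypothetical helper, not part of the `ISequentialVO` interface):

```python
def classify_tracking(inlier_count: int, total_matches: int) -> str:
    """Map inlier statistics to a tracking state using the thresholds above."""
    inlier_ratio = inlier_count / total_matches if total_matches else 0.0
    if inlier_count > 50 and inlier_ratio > 0.5:
        return "good"
    if inlier_count >= 20:
        return "degraded"
    return "lost"
```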

**Error Conditions**:
- Returns `None`: Tracking lost (insufficient matches)

**Test Cases**:
1. **Good overlap (>50%)**: Returns reliable pose
2. **Low overlap (5-10%)**: Still succeeds with LightGlue
3. **<5% overlap**: May return None (tracking loss)
4. **Agricultural texture**: Handles repetitive patterns

---

### `extract_features(image: np.ndarray) -> Features`

**Description**: Extracts SuperPoint keypoints and descriptors from an image.

**Called By**:
- Internal (during compute_relative_pose)
- G08 Global Place Recognition (for descriptor caching)

**Input**:
```python
image: np.ndarray  # Input image (H×W×3 or H×W)
```

**Output**:
```python
Features:
    keypoints: np.ndarray    # (N, 2) - (x, y) coordinates
    descriptors: np.ndarray  # (N, 256) - 256-dim descriptors
    scores: np.ndarray       # (N,) - detection confidence scores
```

**Processing Details**:
- Uses G15 Model Manager to get the SuperPoint model
- Converts to grayscale if needed
- Non-maximum suppression for keypoint selection
- Typically extracts 500-2000 keypoints per image
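The pre- and post-processing around the model call can be sketched as follows. The luma weights and the greedy radius-based NMS are illustrative; SuperPoint's own grid-based NMS and the exact radius/limit parameters may differ:

```python
import numpy as np


def to_grayscale(image: np.ndarray) -> np.ndarray:
    """Convert H×W×3 to H×W using ITU-R BT.601 luma weights; pass H×W through."""
    if image.ndim == 3:
        return image @ np.array([0.299, 0.587, 0.114])
    return image


def radius_nms(keypoints: np.ndarray, scores: np.ndarray,
               radius: float = 4.0, max_keypoints: int = 2000):
    """Greedy non-maximum suppression: keep the strongest keypoints that
    are at least `radius` pixels apart, up to `max_keypoints`."""
    order = np.argsort(-scores)  # strongest first
    kept = []
    for i in order:
        if all(np.linalg.norm(keypoints[i] - keypoints[j]) >= radius
               for j in kept):
            kept.append(i)
        if len(kept) == max_keypoints:
            break
    kept = np.array(kept, dtype=int)
    return keypoints[kept], scores[kept]
```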

**Performance**:
- Inference time: ~15ms with TensorRT on RTX 2060

**Error Conditions**:
- Never fails (returns empty features if image invalid)

**Test Cases**:
1. **FullHD image**: Extracts ~1000 keypoints
2. **High-res image (6252×4168)**: Extracts ~2000 keypoints
3. **Low-texture image**: Extracts fewer keypoints

---

### `match_features(features1: Features, features2: Features) -> Matches`

**Description**: Matches features using the LightGlue attention-based matcher.

**Called By**:
- Internal (during compute_relative_pose)

**Input**:
```python
features1: Features  # Previous frame features
features2: Features  # Current frame features
```

**Output**:
```python
Matches:
    matches: np.ndarray     # (M, 2) - indices [idx1, idx2]
    scores: np.ndarray      # (M,) - match confidence scores
    keypoints1: np.ndarray  # (M, 2) - matched keypoints from frame 1
    keypoints2: np.ndarray  # (M, 2) - matched keypoints from frame 2
```

**Processing Details**:
- Uses G15 Model Manager to get the LightGlue model
- Transformer-based attention mechanism
- "Dustbin" mechanism for unmatched features
- Adaptive depth (exits early for easy matches)
- **Critical**: Handles <5% overlap better than classical nearest-neighbor matching with RANSAC verification
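The `Matches` container above can be assembled from the matcher's raw index/score outputs roughly like this (illustrative numpy; LightGlue itself returns tensors, and the `min_score` cutoff is an assumed parameter, not part of the spec):

```python
import numpy as np


def gather_matches(keypoints1: np.ndarray, keypoints2: np.ndarray,
                   match_indices: np.ndarray, match_scores: np.ndarray,
                   min_score: float = 0.2) -> dict:
    """Drop low-confidence pairs, then gather the matched keypoints."""
    keep = match_scores >= min_score
    idx = match_indices[keep]
    return {
        "matches": idx,                       # (M, 2) - [idx1, idx2]
        "scores": match_scores[keep],         # (M,)
        "keypoints1": keypoints1[idx[:, 0]],  # (M, 2) from frame 1
        "keypoints2": keypoints2[idx[:, 1]],  # (M, 2) from frame 2
    }
```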

**Performance**:
- Inference time: ~35-100ms (adaptive depth)
- Faster for high overlap, slower for low overlap

**Test Cases**:
1. **High overlap**: Fast matching (~35ms), 500+ matches
2. **Low overlap (<5%)**: Slower (~100ms), 20-50 matches
3. **No overlap**: Few or no matches (<10)

---

### `estimate_motion(matches: Matches, camera_params: CameraParameters) -> Optional[Motion]`

**Description**: Estimates camera motion from matched keypoints using the Essential Matrix.

**Called By**:
- Internal (during compute_relative_pose)

**Input**:
```python
matches: Matches
camera_params: CameraParameters:
    focal_length: float
    principal_point: Tuple[float, float]
    resolution: Tuple[int, int]
```

**Output**:
```python
Motion:
    translation: np.ndarray  # (x, y, z) - unit vector (scale ambiguous)
    rotation: np.ndarray     # 3×3 rotation matrix
    inliers: np.ndarray      # Boolean mask of inlier matches
    inlier_count: int
```

**Algorithm**:
1. Normalize keypoint coordinates using camera intrinsics
2. Estimate Essential Matrix using RANSAC
3. Decompose Essential Matrix → [R, t]
4. Return motion with inlier mask
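A numeric sanity check of the relation behind steps 2-3: for a motion (R, t), the Essential Matrix is E = [t]×R, and corresponding normalized points satisfy x2ᵀ E x1 = 0. (In practice the RANSAC estimation and decomposition are done by OpenCV, e.g. `cv2.findEssentialMat` and `cv2.recoverPose`; this sketch just verifies the epipolar constraint on synthetic data.)

```python
import numpy as np


def skew(t: np.ndarray) -> np.ndarray:
    """Cross-product matrix [t]x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])


# Synthetic motion: 5° yaw plus forward translation (unit vector).
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
t = np.array([0.0, 0.0, 1.0])
E = skew(t) @ R  # Essential Matrix

# Random 3D points in front of camera 1, projected into both views
# (normalized image coordinates, i.e. after applying K^-1).
rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(10, 3))
x1 = X / X[:, 2:]        # camera-1 normalized coordinates
X2 = X @ R.T + t         # same points in the camera-2 frame
x2 = X2 / X2[:, 2:]      # camera-2 normalized coordinates

# Epipolar constraint: x2^T E x1 = 0 for every correspondence.
residuals = np.einsum('ni,ij,nj->n', x2, E, x1)
assert np.allclose(residuals, 0, atol=1e-9)
```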

**Scale Ambiguity**:
- Monocular VO has inherent scale ambiguity
- Translation is a unit vector (direction only)
- Scale resolved by:
  - Altitude prior (from G10 Factor Graph)
  - Absolute GPS measurements (from G09 LiteSAM)
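A sketch of the altitude-based option, assuming the unit translation is expressed in a frame whose z-axis is vertical (the function name and interface are hypothetical, and the real Factor Graph integration is more involved):

```python
import numpy as np


def resolve_scale_from_altitude(t_unit: np.ndarray,
                                altitude_delta: float,
                                eps: float = 1e-6):
    """Rescale a unit translation so its vertical component matches the
    altitude change reported by the prior. Returns None when the motion
    is near-horizontal, since altitude then carries no scale information."""
    if abs(t_unit[2]) < eps:
        return None
    scale = altitude_delta / t_unit[2]
    return scale * t_unit
```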

**Error Conditions**:
- Returns `None`: Insufficient inliers (< 8 points for Essential Matrix)

**Test Cases**:
1. **Good matches**: Returns motion with high inlier count
2. **Low inliers**: May return None
3. **Degenerate motion**: Handles pure rotation

## Integration Tests

### Test 1: Normal Flight Sequence
1. Load consecutive frames with 50% overlap
2. compute_relative_pose() → returns valid pose
3. Verify translation direction is reasonable
4. Verify inlier_count > 100

### Test 2: Low Overlap Scenario
1. Load frames with 5% overlap
2. compute_relative_pose() → still succeeds
3. Verify inlier_count > 20
4. Verify LightGlue finds matches despite low overlap

### Test 3: Tracking Loss
1. Load frames with 0% overlap (sharp turn)
2. compute_relative_pose() → returns None
3. Verify tracking_good = False
4. Trigger global place recognition

### Test 4: Agricultural Texture
1. Load images of wheat fields (repetitive texture)
2. compute_relative_pose() → SuperPoint handles repetitive texture better than SIFT
3. Verify match quality

## Non-Functional Requirements

### Performance
- **compute_relative_pose**: < 200ms total
  - SuperPoint extraction: ~15ms × 2 = 30ms
  - LightGlue matching: ~50ms
  - Motion estimation: ~10ms
- **Frame rate**: 5-10 FPS processing (meets <5s requirement)

### Accuracy
- **Relative rotation**: ±2° error
- **Relative translation direction**: ±5° error
- **Inlier ratio**: >50% for good tracking

### Reliability
- Handle 100m spacing between frames
- Survive temporary tracking degradation
- Recover from brief occlusions

## Dependencies

### Internal Components
- **G15 Model Manager**: For SuperPoint and LightGlue models
- **G16 Configuration Manager**: For camera parameters
- **H01 Camera Model**: For coordinate normalization
- **H05 Performance Monitor**: For timing measurements

### External Dependencies
- **SuperPoint**: Feature extraction model
- **LightGlue**: Feature matching model
- **opencv-python**: Essential Matrix estimation
- **numpy**: Matrix operations

## Data Models

### Features
```python
class Features(BaseModel):
    # np.ndarray fields require arbitrary types in pydantic;
    # the same setting applies to the other models below.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    keypoints: np.ndarray    # (N, 2)
    descriptors: np.ndarray  # (N, 256)
    scores: np.ndarray       # (N,)
```

### Matches
```python
class Matches(BaseModel):
    matches: np.ndarray     # (M, 2) - pairs of indices
    scores: np.ndarray      # (M,) - match confidence
    keypoints1: np.ndarray  # (M, 2)
    keypoints2: np.ndarray  # (M, 2)
```

### RelativePose
```python
class RelativePose(BaseModel):
    translation: np.ndarray  # (3,) - unit vector
    rotation: np.ndarray     # (3, 3) or (4,) quaternion
    confidence: float
    inlier_count: int
    total_matches: int
    tracking_good: bool
    scale_ambiguous: bool = True
```

### Motion
```python
class Motion(BaseModel):
    translation: np.ndarray  # (3,)
    rotation: np.ndarray     # (3, 3)
    inliers: np.ndarray      # Boolean mask
    inlier_count: int
```