add features

This commit is contained in:
Oleksandr Bezdieniezhnykh
2025-12-01 01:07:46 +02:00
parent 97f558b3d7
commit 54be35fde7
81 changed files with 4618 additions and 10 deletions
@@ -0,0 +1,63 @@
# Feature: Feature Extraction
## Description
SuperPoint-based keypoint and descriptor extraction from UAV images. Provides the foundation for visual odometry by detecting repeatable keypoints and computing discriminative 256-dimensional descriptors.
## Component APIs Implemented
- `extract_features(image: np.ndarray) -> Features`
## External Tools and Services
- **SuperPoint**: Neural network model for keypoint detection and descriptor extraction
- **F16 Model Manager**: Provides pre-loaded SuperPoint model instance
## Internal Methods
### `_preprocess_image(image: np.ndarray) -> np.ndarray`
Converts the image to grayscale if needed and normalizes pixel values for model input.
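The preprocessing step can be sketched as a minimal NumPy version; the exact input layout (tensor shape, device) depends on the SuperPoint backend, so this is illustrative only:

```python
import numpy as np

def preprocess_image(image: np.ndarray) -> np.ndarray:
    """Convert a uint8 image to grayscale if needed and normalize to [0, 1] float32."""
    if image.ndim == 3:
        # Luminance weights for RGB -> grayscale (ITU-R BT.601)
        image = image @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return image.astype(np.float32) / 255.0
```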
### `_run_superpoint_inference(preprocessed: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]`
Executes SuperPoint model inference, returns raw keypoints, descriptors, and scores.
### `_apply_nms(keypoints: np.ndarray, scores: np.ndarray, nms_radius: int) -> np.ndarray`
Applies non-maximum suppression to filter clustered keypoints; typically keeps 500-2000 keypoints per image.
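A minimal greedy radius-based NMS illustrating the suppression logic (the deployed version would typically operate on the model's dense score map rather than a keypoint list):

```python
import numpy as np

def apply_nms(keypoints: np.ndarray, scores: np.ndarray, nms_radius: int) -> np.ndarray:
    """Greedy radius NMS: keep the highest-scoring keypoints, suppress neighbours.

    keypoints: (N, 2) pixel coordinates; scores: (N,) detection scores.
    Returns indices of kept keypoints in descending score order.
    """
    order = np.argsort(-scores)
    suppressed = np.zeros(len(keypoints), dtype=bool)
    keep = []
    for i in order:
        if suppressed[i]:
            continue
        keep.append(i)
        # Suppress everything within nms_radius of the kept keypoint
        dists = np.linalg.norm(keypoints - keypoints[i], axis=1)
        suppressed |= dists < nms_radius
    return np.array(keep, dtype=int)
```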
## Unit Tests
### Test: Grayscale Conversion
- Input: RGB image (H×W×3)
- Verify: _preprocess_image returns grayscale (H×W)
### Test: Grayscale Passthrough
- Input: Already grayscale image (H×W)
- Verify: _preprocess_image returns unchanged
### Test: Feature Count Range
- Input: Standard UAV image
- Verify: Returns 500-2000 keypoints
### Test: Descriptor Dimensions
- Input: Any valid image
- Verify: Descriptors shape is (N, 256)
### Test: Empty Image Handling
- Input: Black/invalid image
- Verify: Returns empty Features (never raises exception)
### Test: High Resolution Image
- Input: 6252×4168 image
- Verify: Extracts ~2000 keypoints within performance budget
### Test: Low Texture Image
- Input: Uniform texture (sky, water)
- Verify: Returns fewer keypoints gracefully
## Integration Tests
### Test: Model Manager Integration
- Verify: Successfully retrieves SuperPoint model from F16
- Verify: Model loaded with correct TensorRT/ONNX backend
### Test: Performance Budget
- Input: FullHD image
- Verify: Extraction completes in <15ms on RTX 2060
@@ -0,0 +1,73 @@
# Feature: Feature Matching
## Description
LightGlue-based attention matching between feature sets from consecutive frames. Handles challenging low-overlap scenarios (<5%) using transformer-based attention mechanism with adaptive depth.
## Component APIs Implemented
- `match_features(features1: Features, features2: Features) -> Matches`
## External Tools and Services
- **LightGlue**: Transformer-based feature matcher with adaptive depth
- **F16 Model Manager**: Provides pre-loaded LightGlue model instance
## Internal Methods
### `_prepare_features_for_lightglue(features: Features) -> Dict`
Formats Features dataclass into LightGlue-compatible tensor format.
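A sketch of the packing step, assuming the batched key names used by the LightGlue reference implementation (`keypoints`, `descriptors`); a real wrapper would also convert the arrays to torch tensors on the inference device:

```python
import numpy as np

def prepare_features_for_lightglue(keypoints: np.ndarray, descriptors: np.ndarray) -> dict:
    """Pack (N, 2) keypoints and (N, 256) descriptors into a batched dict.

    The explicit array arguments are a stand-in for the Features dataclass.
    """
    return {
        "keypoints": keypoints[None].astype(np.float32),      # (1, N, 2)
        "descriptors": descriptors[None].astype(np.float32),  # (1, N, 256)
    }
```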
### `_run_lightglue_inference(features1_dict: Dict, features2_dict: Dict) -> Tuple[np.ndarray, np.ndarray]`
Executes LightGlue inference, returns match indices and confidence scores. Uses adaptive depth (exits early for easy matches).
### `_filter_matches_by_confidence(matches: np.ndarray, scores: np.ndarray, threshold: float) -> Tuple[np.ndarray, np.ndarray]`
Filters matches below confidence threshold (dustbin mechanism).
### `_extract_matched_keypoints(features1: Features, features2: Features, match_indices: np.ndarray) -> Tuple[np.ndarray, np.ndarray]`
Extracts matched keypoint coordinates from both feature sets using match indices.
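The two post-processing steps can be sketched together. `matches` is assumed to be an (M, 2) array of index pairs into the two feature sets, with dustbin (unmatched) entries already excluded:

```python
import numpy as np

def filter_matches_by_confidence(matches: np.ndarray, scores: np.ndarray,
                                 threshold: float):
    """Drop matches whose confidence falls below the threshold."""
    mask = scores >= threshold
    return matches[mask], scores[mask]

def extract_matched_keypoints(keypoints1: np.ndarray, keypoints2: np.ndarray,
                              matches: np.ndarray):
    """Gather the (x, y) coordinates of both endpoints of each match."""
    return keypoints1[matches[:, 0]], keypoints2[matches[:, 1]]
```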
## Unit Tests
### Test: High Overlap Matching
- Input: Features from frames with >50% overlap
- Verify: Returns 500+ matches
- Verify: Inference time ~35ms (fast path)
### Test: Low Overlap Matching
- Input: Features from frames with 5-10% overlap
- Verify: Returns 20-50 matches
- Verify: Inference time ~100ms (full depth)
### Test: No Overlap Handling
- Input: Features from non-overlapping frames
- Verify: Returns <10 matches
- Verify: No exception raised
### Test: Match Index Validity
- Input: Any valid feature pairs
- Verify: All match indices within valid range for both feature sets
### Test: Confidence Score Range
- Input: Any valid feature pairs
- Verify: All scores in [0, 1] range
### Test: Empty Features Handling
- Input: Empty Features object
- Verify: Returns empty Matches (no exception)
### Test: Matched Keypoints Extraction
- Input: Features and match indices
- Verify: keypoints1 and keypoints2 arrays have same length as matches
## Integration Tests
### Test: Model Manager Integration
- Verify: Successfully retrieves LightGlue model from F16
- Verify: Model compatible with SuperPoint descriptors
### Test: Adaptive Depth Behavior
- Input: High overlap pair, then low overlap pair
- Verify: High overlap completes faster than low overlap
### Test: Agricultural Texture Handling
- Input: Features from repetitive wheat field images
- Verify: Produces valid matches despite repetitive patterns
@@ -0,0 +1,109 @@
# Feature: Relative Pose Computation
## Description
Orchestrates the full visual odometry pipeline and estimates camera motion from matched features using Essential Matrix decomposition. Computes relative pose between consecutive frames and provides tracking quality indicators.
## Component APIs Implemented
- `compute_relative_pose(prev_image: np.ndarray, curr_image: np.ndarray) -> Optional[RelativePose]`
- `estimate_motion(matches: Matches, camera_params: CameraParameters) -> Optional[Motion]`
## External Tools and Services
- **opencv-python**: Essential Matrix estimation via RANSAC, matrix decomposition
- **numpy**: Matrix operations, coordinate normalization
- **F17 Configuration Manager**: Camera parameters (focal length, principal point)
- **H01 Camera Model**: Coordinate normalization utilities
## Internal Methods
### `_normalize_keypoints(keypoints: np.ndarray, camera_params: CameraParameters) -> np.ndarray`
Normalizes pixel coordinates to camera-centered coordinates using intrinsic matrix.
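A minimal version of the normalization, using an explicit (fx, fy, cx, cy) signature as a stand-in for CameraParameters and assuming a zero-skew pinhole model (equivalent to applying K⁻¹):

```python
import numpy as np

def normalize_keypoints(keypoints: np.ndarray, fx: float, fy: float,
                        cx: float, cy: float) -> np.ndarray:
    """Map (N, 2) pixel coordinates to normalized camera coordinates."""
    out = np.empty_like(keypoints, dtype=np.float64)
    out[:, 0] = (keypoints[:, 0] - cx) / fx  # center at principal point, scale by fx
    out[:, 1] = (keypoints[:, 1] - cy) / fy
    return out
```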
### `_estimate_essential_matrix(points1: np.ndarray, points2: np.ndarray) -> Tuple[np.ndarray, np.ndarray]`
RANSAC-based Essential Matrix estimation, returns E matrix and inlier mask.
### `_decompose_essential_matrix(E: np.ndarray, points1: np.ndarray, points2: np.ndarray) -> Tuple[np.ndarray, np.ndarray]`
Decomposes Essential Matrix into rotation R and translation t (unit vector).
### `_compute_tracking_quality(inlier_count: int, total_matches: int) -> Tuple[float, bool]`
Computes confidence score and tracking_good flag based on inlier statistics.
- Good: inlier_count > 50, inlier_ratio > 0.5
- Degraded: inlier_count 20-50
- Lost: inlier_count < 20
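A sketch of the quality mapping; only the tier boundaries come from the spec above, while the exact confidence formula is illustrative:

```python
def compute_tracking_quality(inlier_count: int, total_matches: int):
    """Map inlier statistics to (confidence, tracking_good) per the tiers above."""
    ratio = inlier_count / total_matches if total_matches else 0.0
    if inlier_count < 20:
        return 0.0, False           # lost
    if inlier_count <= 50 or ratio <= 0.5:
        return 0.5 * ratio, True    # degraded: reduced confidence
    return min(1.0, ratio), True    # good
```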
### `_build_relative_pose(motion: Motion, matches: Matches) -> RelativePose`
Constructs RelativePose dataclass from motion estimate and match statistics.
## Unit Tests
### Test: Keypoint Normalization
- Input: Pixel coordinates and camera params
- Verify: Output centered at principal point, scaled by focal length
### Test: Essential Matrix Estimation - Good Data
- Input: 100+ inlier correspondences
- Verify: Returns valid Essential Matrix (det(E) ≈ 0; singular values ≈ (s, s, 0))
### Test: Essential Matrix Estimation - Insufficient Points
- Input: <8 point correspondences
- Verify: Returns None
### Test: Essential Matrix Decomposition
- Input: Valid Essential Matrix
- Verify: Returns valid rotation (det = 1) and unit translation
### Test: Tracking Quality - Good
- Input: inlier_count=100, total_matches=150
- Verify: tracking_good=True, confidence>0.5
### Test: Tracking Quality - Degraded
- Input: inlier_count=30, total_matches=50
- Verify: tracking_good=True, confidence reduced
### Test: Tracking Quality - Lost
- Input: inlier_count=10, total_matches=20
- Verify: tracking_good=False
### Test: Scale Ambiguity
- Input: Any valid motion estimate
- Verify: translation vector has unit norm (||t|| = 1)
- Verify: scale_ambiguous flag is True
### Test: Pure Rotation Handling
- Input: Matches from pure rotational motion
- Verify: Returns valid pose (translation ≈ 0)
## Integration Tests
### Test: Full Pipeline - Normal Flight
- Input: Consecutive frames with 50% overlap
- Verify: Returns valid RelativePose
- Verify: inlier_count > 100
- Verify: Total time < 200ms
### Test: Full Pipeline - Low Overlap
- Input: Frames with 5% overlap
- Verify: Returns valid RelativePose
- Verify: inlier_count > 20
### Test: Full Pipeline - Tracking Loss
- Input: Non-overlapping frames (sharp turn)
- Verify: Returns None
- Verify: tracking_good would be False
### Test: Configuration Manager Integration
- Verify: Successfully retrieves camera_params from F17
- Verify: Parameters match expected resolution and focal length
### Test: Camera Model Integration
- Verify: H01 normalization produces correct coordinates
- Verify: Consistent with OpenCV undistortion
### Test: Pipeline Orchestration
- Verify: extract_features called twice (prev, curr)
- Verify: match_features called once
- Verify: estimate_motion called with correct params
### Test: Agricultural Environment
- Input: Wheat field images with repetitive texture
- Verify: Pipeline succeeds with reasonable inlier count