add chunking

Oleksandr Bezdieniezhnykh
2025-11-27 03:43:19 +02:00
parent 4f8c18a066
commit 2037870f67
43 changed files with 7041 additions and 4135 deletions
@@ -0,0 +1,617 @@
# Route Chunk Manager
## Interface Definition
**Interface Name**: `IRouteChunkManager`
### Interface Methods
```python
# Postponed annotation evaluation lets the interface reference the data models
# (ChunkHandle, ChunkBounds, ...) that are defined later in this document.
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import List, Optional

import numpy as np


class IRouteChunkManager(ABC):
    @abstractmethod
    def create_chunk(self, flight_id: str, start_frame_id: int) -> ChunkHandle:
        pass

    @abstractmethod
    def add_frame_to_chunk(self, chunk_id: str, frame_id: int, vo_result: RelativePose) -> bool:
        pass

    @abstractmethod
    def get_chunk_frames(self, chunk_id: str) -> List[int]:
        pass

    @abstractmethod
    def get_chunk_images(self, chunk_id: str) -> List[np.ndarray]:
        pass

    @abstractmethod
    def get_chunk_composite_descriptor(self, chunk_id: str) -> np.ndarray:
        pass

    @abstractmethod
    def get_chunk_bounds(self, chunk_id: str) -> ChunkBounds:
        pass

    @abstractmethod
    def is_chunk_ready_for_matching(self, chunk_id: str) -> bool:
        pass

    @abstractmethod
    def mark_chunk_anchored(self, chunk_id: str, frame_id: int, gps: GPSPoint) -> bool:
        pass

    @abstractmethod
    def get_chunks_for_matching(self, flight_id: str) -> List[ChunkHandle]:
        pass

    @abstractmethod
    def get_active_chunk(self, flight_id: str) -> Optional[ChunkHandle]:
        pass

    @abstractmethod
    def deactivate_chunk(self, chunk_id: str) -> bool:
        pass

    @abstractmethod
    def merge_chunks(self, chunk_id_1: str, chunk_id_2: str, transform: Sim3Transform) -> bool:
        pass

    @abstractmethod
    def mark_chunk_matching(self, chunk_id: str) -> bool:
        pass
```
## Component Description
### Responsibilities
- Manage chunk lifecycle (creation, activation, deactivation, merging)
- Track chunk state (frames, anchors, matching status)
- Coordinate chunk semantic matching and LiteSAM matching
- Provide chunk representations for matching (composite images, descriptors)
- Determine chunk readiness for matching (min frames, consistency)
### Scope
- Chunk lifecycle management
- Chunk state tracking
- Chunk representation generation (descriptors, bounds)
- Integration point for chunk matching coordination
## API Methods
### `create_chunk(flight_id: str, start_frame_id: int) -> ChunkHandle`
**Description**: Creates a new route chunk and initializes it in the factor graph.
**Called By**:
- F02 Flight Processor (when tracking lost)
- F11 Failure Recovery Coordinator (proactive chunk creation)
**Input**:
```python
flight_id: str
start_frame_id: int # First frame in chunk
```
**Output**:
```python
ChunkHandle:
    chunk_id: str
    flight_id: str
    start_frame_id: int
    end_frame_id: Optional[int]
    frames: List[int]
    is_active: bool
    has_anchor: bool
    anchor_frame_id: Optional[int]
    anchor_gps: Optional[GPSPoint]
    matching_status: str
```
**Processing Flow**:
1. Generate unique chunk_id
2. Call F10 Factor Graph Optimizer.create_new_chunk()
3. Initialize chunk state (unanchored, active)
4. Store chunk metadata
5. Return ChunkHandle
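
The five steps above can be sketched with an in-memory store. This is a minimal illustration, not the actual implementation: `ChunkState`, `FakeFactorGraph`, `InMemoryChunkStore`, the `chunk_` ID format, and the arguments passed to `create_new_chunk()` are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChunkState:
    """Illustrative stand-in for ChunkHandle (a plain dataclass, not the real model)."""
    chunk_id: str
    flight_id: str
    start_frame_id: int
    end_frame_id: Optional[int] = None
    frames: List[int] = field(default_factory=list)
    is_active: bool = True
    has_anchor: bool = False
    matching_status: str = "unanchored"


class FakeFactorGraph:
    """Hypothetical stub for F10; it only records which chunks were created."""

    def __init__(self) -> None:
        self.created: List[str] = []

    def create_new_chunk(self, chunk_id: str, start_frame_id: int) -> None:
        self.created.append(chunk_id)


class InMemoryChunkStore:
    def __init__(self, factor_graph: FakeFactorGraph) -> None:
        self._f10 = factor_graph
        self._chunks: Dict[str, ChunkState] = {}

    def create_chunk(self, flight_id: str, start_frame_id: int) -> ChunkState:
        chunk_id = f"chunk_{uuid.uuid4().hex[:8]}"               # 1. unique ID (assumed format)
        self._f10.create_new_chunk(chunk_id, start_frame_id)     # 2. register in the factor graph
        chunk = ChunkState(chunk_id, flight_id, start_frame_id)  # 3. defaults: unanchored, active
        self._chunks[chunk_id] = chunk                           # 4. store metadata
        return chunk                                             # 5. return the handle
```

Calling `create_chunk()` twice for the same flight yields two independent chunks, which matches the "Multiple chunks" test case below.
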
**Test Cases**:
1. **Create chunk**: Returns ChunkHandle with is_active=True
2. **Multiple chunks**: Can create multiple chunks for same flight
3. **Chunk initialization**: Chunk initialized in factor graph
---
### `add_frame_to_chunk(chunk_id: str, frame_id: int, vo_result: RelativePose) -> bool`
**Description**: Adds a frame to an existing chunk with its VO result.
**Called By**:
- F02 Flight Processor (during frame processing)
**Input**:
```python
chunk_id: str
frame_id: int
vo_result: RelativePose # From F07 Sequential VO
```
**Output**:
```python
bool: True if frame added successfully
```
**Processing Flow**:
1. Verify chunk exists and is active
2. Add frame_id to chunk's frames list
3. Store vo_result for chunk
4. Call F10.add_relative_factor_to_chunk()
5. Update chunk's end_frame_id
6. Check if chunk ready for matching
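
A rough sketch of steps 1 to 5, assuming chunks live in a plain dictionary. `Chunk`, `add_frame`, and the `add_factor` callback (standing in for `F10.add_relative_factor_to_chunk()`, whose real signature is not shown in this document) are hypothetical names; the readiness check in step 6 is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Chunk:
    """Illustrative chunk record holding only the fields this flow touches."""
    is_active: bool = True
    frames: List[int] = field(default_factory=list)
    vo_results: Dict[int, Any] = field(default_factory=dict)
    end_frame_id: Optional[int] = None


def add_frame(
    chunks: Dict[str, Chunk],
    chunk_id: str,
    frame_id: int,
    vo_result: Any,
    add_factor: Callable[[str, int, Any], None] = lambda *args: None,
) -> bool:
    chunk = chunks.get(chunk_id)
    if chunk is None or not chunk.is_active:   # 1. chunk must exist and be active
        return False
    chunk.frames.append(frame_id)              # 2. extend the frame list
    chunk.vo_results[frame_id] = vo_result     # 3. keep the VO result
    add_factor(chunk_id, frame_id, vo_result)  # 4. stand-in for the F10 call
    chunk.end_frame_id = frame_id              # 5. newest frame closes the range
    return True
```
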
**Test Cases**:
1. **Add frame to active chunk**: Frame added successfully
2. **Add frame to inactive chunk**: Returns False
3. **Chunk growth**: Chunk frames list updated
---
### `get_chunk_frames(chunk_id: str) -> List[int]`
**Description**: Retrieves list of frame IDs in a chunk.
**Called By**:
- F08 Global Place Recognition (for chunk descriptor computation)
- F09 Metric Refinement (for chunk LiteSAM matching)
- F11 Failure Recovery Coordinator (chunk state queries)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
List[int] # Frame IDs in chunk, ordered by sequence
```
**Test Cases**:
1. **Get frames**: Returns all frames in chunk
2. **Empty chunk**: Returns empty list
3. **Ordered frames**: Frames returned in sequence order
---
### `get_chunk_images(chunk_id: str) -> List[np.ndarray]`
**Description**: Retrieves images for all frames in a chunk.
**Called By**:
- F08 Global Place Recognition (chunk descriptor computation)
- F09 Metric Refinement (chunk LiteSAM matching)
- F06 Image Rotation Manager (chunk rotation)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
List[np.ndarray] # Images for each frame in chunk
```
**Processing Flow**:
1. Get chunk frames via get_chunk_frames()
2. Load images from F05 Image Input Pipeline
3. Return list of images
**Test Cases**:
1. **Get images**: Returns all images in chunk
2. **Image loading**: Images loaded correctly from pipeline
3. **Order consistency**: Images match frame order
---
### `get_chunk_composite_descriptor(chunk_id: str) -> np.ndarray`
**Description**: Computes aggregate DINOv2 descriptor for chunk (for semantic matching).
**Called By**:
- F08 Global Place Recognition (chunk semantic matching)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
np.ndarray: Aggregated descriptor vector (4096-dim or 8192-dim)
```
**Processing Flow**:
1. Get chunk images via get_chunk_images()
2. Call F08.compute_chunk_descriptor(chunk_images) → aggregate descriptor
3. Return composite descriptor
**Delegation**:
- Delegates to F08.compute_chunk_descriptor() for descriptor computation
- F08 handles aggregation logic (mean, VLAD, or max)
- Single source of truth for chunk descriptor computation
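
For orientation, the two simpler aggregation modes (mean and max) might look like the sketch below. It is not F08's actual code: `aggregate_descriptors` is a hypothetical name, the final L2 normalization is an assumption, and VLAD is omitted because it needs a trained codebook.

```python
import numpy as np


def aggregate_descriptors(descriptors: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse an (N, D) stack of per-frame descriptors into one (D,) vector."""
    if descriptors.ndim != 2 or descriptors.shape[0] == 0:
        raise ValueError("expected a non-empty (N, D) array")
    if mode == "mean":
        pooled = descriptors.mean(axis=0)
    elif mode == "max":
        pooled = descriptors.max(axis=0)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    # Assumed step: L2-normalize so chunk descriptors compare by cosine similarity.
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled
```
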
**Test Cases**:
1. **Compute descriptor**: Returns aggregated descriptor
2. **Multiple images**: Descriptor aggregates correctly
3. **Descriptor quality**: More robust than single-image descriptor
---
### `get_chunk_bounds(chunk_id: str) -> ChunkBounds`
**Description**: Estimates GPS bounds of a chunk based on VO trajectory.
**Called By**:
- F11 Failure Recovery Coordinator (for tile search area)
- F04 Satellite Data Manager (for tile prefetching)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
ChunkBounds:
    estimated_center: GPSPoint
    estimated_radius: float  # meters
    confidence: float
```
**Processing Flow**:
1. Get chunk trajectory from F10.get_chunk_trajectory()
2. If chunk has anchor:
   - Use anchor GPS as center
   - Compute radius from trajectory extent
3. If chunk unanchored:
   - Estimate center from VO trajectory (relative to start)
   - Use dead-reckoning estimate
   - Lower confidence
4. Return ChunkBounds
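
One way to read "radius from trajectory extent" is the largest distance from the chosen center to any trajectory point. The sketch below assumes the trajectory is already expressed as metric offsets in a local plane; `estimate_bounds` is a hypothetical helper, the confidence values 0.9 and 0.3 are placeholders, and the conversion of the center to a `GPSPoint` is omitted.

```python
from typing import Optional, Tuple

import numpy as np


def estimate_bounds(
    positions_m: np.ndarray, anchor_index: Optional[int] = None
) -> Tuple[np.ndarray, float, float]:
    """Return (center, radius_m, confidence) for an (N, 2) local trajectory in meters."""
    if anchor_index is not None:
        center = positions_m[anchor_index]  # anchored: center on the anchored frame
        confidence = 0.9                    # placeholder value
    else:
        center = positions_m.mean(axis=0)   # unanchored: centroid relative to chunk start
        confidence = 0.3                    # placeholder value, lower than anchored
    radius = float(np.linalg.norm(positions_m - center, axis=1).max())
    return center, radius, confidence
```
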
**Test Cases**:
1. **Anchored chunk**: Returns accurate bounds with high confidence
2. **Unanchored chunk**: Returns estimated bounds with lower confidence
3. **Bounds calculation**: Radius computed from trajectory extent
---
### `is_chunk_ready_for_matching(chunk_id: str) -> bool`
**Description**: Determines whether a chunk has enough frames and sufficient internal VO consistency to be matched.
**Called By**:
- F11 Failure Recovery Coordinator (before attempting matching)
- Background matching task
**Input**:
```python
chunk_id: str
```
**Output**:
```python
bool: True if chunk ready for matching
```
**Criteria**:
- **Min frames**: >= 5 frames (configurable)
- **Max frames**: <= 20 frames (configurable, prevents oversized chunks)
- **Internal consistency**: VO factors have reasonable inlier counts
- **Not already matched**: matching_status is neither "anchored" nor "merged"
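
The criteria can be combined into a single predicate, sketched below. The frame limits mirror the defaults listed above; `min_inliers` is a hypothetical threshold, since this document does not define what counts as a "reasonable" inlier count.

```python
from typing import Sequence


def is_ready_for_matching(
    frame_count: int,
    matching_status: str,
    inlier_counts: Sequence[int],
    min_frames: int = 5,
    max_frames: int = 20,
    min_inliers: int = 30,  # hypothetical threshold, not specified in this document
) -> bool:
    # Chunks that are already anchored or merged are never matched again.
    if matching_status in ("anchored", "merged"):
        return False
    # Frame count must fall inside the configured window.
    if not (min_frames <= frame_count <= max_frames):
        return False
    # Every VO factor must clear the assumed inlier threshold.
    return all(count >= min_inliers for count in inlier_counts)
```
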
**Test Cases**:
1. **Ready chunk**: 10 frames, good consistency → True
2. **Too few frames**: 3 frames → False
3. **Already anchored**: has_anchor=True → False
---
### `mark_chunk_anchored(chunk_id: str, frame_id: int, gps: GPSPoint) -> bool`
**Description**: Marks chunk as anchored with GPS coordinate.
**Called By**:
- F11 Failure Recovery Coordinator (after successful chunk matching)
**Input**:
```python
chunk_id: str
frame_id: int # Frame within chunk that was anchored
gps: GPSPoint
```
**Output**:
```python
bool: True if marked successfully
```
**Processing Flow**:
1. Verify chunk exists
2. Call F10.add_chunk_anchor()
3. Update chunk state (has_anchor=True, anchor_frame_id, anchor_gps)
4. Update matching_status to "anchored"
5. Trigger chunk optimization
**Test Cases**:
1. **Mark anchored**: Chunk state updated correctly
2. **Anchor in factor graph**: F10 anchor added
3. **Chunk optimization**: Chunk optimized after anchoring
---
### `get_chunks_for_matching(flight_id: str) -> List[ChunkHandle]`
**Description**: Retrieves all unanchored chunks ready for matching.
**Called By**:
- F11 Failure Recovery Coordinator (background matching task)
- Background processing task
**Input**:
```python
flight_id: str
```
**Output**:
```python
List[ChunkHandle] # Unanchored chunks ready for matching
```
**Processing Flow**:
1. Get all chunks for flight_id
2. Filter chunks where:
   - has_anchor == False
   - is_chunk_ready_for_matching() == True
   - matching_status is "unanchored" or "matching"
3. Return filtered list
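
As a sketch, the filter reduces to one list comprehension. Chunks are plain dictionaries here for brevity, and `chunks_for_matching` and the injected `is_ready` predicate are illustrative names rather than part of the interface.

```python
from typing import Callable, Dict, List


def chunks_for_matching(
    chunks: List[Dict], flight_id: str, is_ready: Callable[[Dict], bool]
) -> List[Dict]:
    """Keep unanchored chunks of one flight that pass the readiness predicate."""
    return [
        c for c in chunks
        if c["flight_id"] == flight_id
        and not c["has_anchor"]
        and c["matching_status"] in ("unanchored", "matching")
        and is_ready(c)
    ]
```
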
**Test Cases**:
1. **Get unanchored chunks**: Returns ready chunks
2. **Filter criteria**: Only returns chunks meeting criteria
3. **Empty result**: Returns empty list if no ready chunks
---
### `get_active_chunk(flight_id: str) -> Optional[ChunkHandle]`
**Description**: Gets the currently active chunk for a flight.
**Called By**:
- F02 Flight Processor (before processing frame)
**Input**:
```python
flight_id: str
```
**Output**:
```python
Optional[ChunkHandle] # Active chunk or None
```
**Test Cases**:
1. **Get active chunk**: Returns active chunk
2. **No active chunk**: Returns None
3. **Multiple chunks**: Returns only active chunk
---
### `deactivate_chunk(chunk_id: str) -> bool`
**Description**: Deactivates a chunk (typically after merging or completion).
**Called By**:
- F11 Failure Recovery Coordinator (after chunk merged)
- F02 Flight Processor (chunk lifecycle)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
bool: True if deactivated successfully
```
**Processing Flow**:
1. Verify chunk exists
2. Update chunk state (is_active=False)
3. If the chunk is being deactivated as part of a merge, update matching_status to "merged"
4. Return True
**Test Cases**:
1. **Deactivate chunk**: Chunk marked inactive
2. **After merge**: Matching status updated to "merged"
---
### `merge_chunks(chunk_id_1: str, chunk_id_2: str, transform: Sim3Transform) -> bool`
**Description**: Coordinates chunk merging by validating chunks, calling F10 for factor graph merge, and updating chunk states.
**Called By**:
- F11 Failure Recovery Coordinator (after successful chunk matching)
**Input**:
```python
chunk_id_1: str # Source chunk (typically newer, to be merged)
chunk_id_2: str # Target chunk (typically older, merged into)
transform: Sim3Transform:
    translation: np.ndarray  # (3,)
    rotation: np.ndarray  # (3, 3) or quaternion
    scale: float
```
**Output**:
```python
bool: True if merge successful
```
**Processing Flow**:
1. Verify both chunks exist
2. Verify chunk_id_1 is anchored (has_anchor=True)
3. Validate chunks can be merged (not already merged, not same chunk)
4. Call F10.merge_chunks(chunk_id_1, chunk_id_2, transform)
5. Update chunk_id_1 state:
   - Set is_active=False
   - Set matching_status="merged"
   - Call deactivate_chunk(chunk_id_1)
6. Update chunk_id_2 state (if needed)
7. Return True
**Validation**:
- Both chunks must exist
- chunk_id_1 must be anchored
- chunk_id_1 must not already be merged
- chunk_id_1 and chunk_id_2 must be different
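
The validation rules can be checked before any call to F10. A minimal sketch, with chunks as plain dictionaries and `can_merge` as a hypothetical helper name:

```python
from typing import Dict


def can_merge(chunks: Dict[str, dict], source_id: str, target_id: str) -> bool:
    """Apply the four validation rules; source_id plays the role of chunk_id_1."""
    if source_id == target_id:                    # chunks must be different
        return False
    source, target = chunks.get(source_id), chunks.get(target_id)
    if source is None or target is None:          # both chunks must exist
        return False
    if not source["has_anchor"]:                  # source must be anchored
        return False
    return source["matching_status"] != "merged"  # source must not already be merged
```
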
**Test Cases**:
1. **Merge anchored chunks**: Chunks merged successfully, chunk_id_1 deactivated
2. **Merge unanchored chunk**: Returns False (validation fails)
3. **Merge already merged chunk**: Returns False (validation fails)
4. **State updates**: chunk_id_1 marked as merged and deactivated
---
### `mark_chunk_matching(chunk_id: str) -> bool`
**Description**: Explicitly marks chunk as being matched (updates matching_status to "matching").
**Called By**:
- F11 Failure Recovery Coordinator (when chunk matching starts)
**Input**:
```python
chunk_id: str
```
**Output**:
```python
bool: True if marked successfully
```
**Processing Flow**:
1. Verify chunk exists
2. Verify chunk is unanchored (has_anchor=False)
3. Update matching_status to "matching"
4. Return True
**State Transition**:
- `unanchored` → `matching` (explicit transition)
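
The four `matching_status` values can be guarded with a small transition table. The table below is an assumption inferred from the processing flows in this document (for example, `mark_chunk_anchored()` does not require a prior "matching" state), not a specified state machine; `MatchingStatus`, `ALLOWED_TRANSITIONS`, and `transition` are illustrative names.

```python
from enum import Enum


class MatchingStatus(str, Enum):
    UNANCHORED = "unanchored"
    MATCHING = "matching"
    ANCHORED = "anchored"
    MERGED = "merged"


# Assumed transition table, inferred from the method descriptions.
ALLOWED_TRANSITIONS = {
    MatchingStatus.UNANCHORED: {MatchingStatus.MATCHING, MatchingStatus.ANCHORED},
    MatchingStatus.MATCHING: {MatchingStatus.ANCHORED},
    MatchingStatus.ANCHORED: {MatchingStatus.MERGED},
    MatchingStatus.MERGED: set(),
}


def transition(current: MatchingStatus, target: MatchingStatus) -> MatchingStatus:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"invalid transition: {current.value} -> {target.value}")
    return target
```
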
**Test Cases**:
1. **Mark unanchored chunk**: Status updated to "matching"
2. **Mark already anchored chunk**: Returns False (invalid state)
3. **Mark non-existent chunk**: Returns False
## Integration Tests
### Test 1: Chunk Lifecycle
1. create_chunk() → chunk created
2. add_frame_to_chunk() × 10 → 10 frames added
3. is_chunk_ready_for_matching() → True
4. mark_chunk_anchored() → chunk anchored
5. deactivate_chunk() → chunk deactivated
### Test 2: Chunk Descriptor Computation
1. Create chunk with 10 frames
2. get_chunk_images() → 10 images
3. get_chunk_composite_descriptor() → aggregated descriptor
4. Verify descriptor more robust than single-image descriptor
### Test 3: Multiple Chunks
1. Create chunk_1 (frames 1-10)
2. Create chunk_2 (frames 20-30)
3. get_chunks_for_matching() → returns both chunks
4. mark_chunk_anchored(chunk_1) → chunk_1 anchored
5. get_chunks_for_matching() → returns only chunk_2
### Test 4: Chunk Merging
1. Create chunk_1 (frames 1-10), chunk_2 (frames 20-30)
2. Anchor chunk_1 via mark_chunk_anchored()
3. merge_chunks(chunk_1, chunk_2, transform) → chunks merged
4. Verify chunk_1 marked as merged and deactivated
5. Verify F10 merge_chunks() called
### Test 5: Chunk Matching Status
1. Create chunk
2. mark_chunk_matching() → status updated to "matching"
3. mark_chunk_anchored() → status updated to "anchored"
4. Verify explicit state transitions
## Non-Functional Requirements
### Performance
- **create_chunk**: < 10ms
- **add_frame_to_chunk**: < 5ms
- **get_chunk_composite_descriptor**: < 3s for 20 images (async)
- **get_chunk_bounds**: < 10ms
### Reliability
- Chunk state persisted across restarts
- Graceful handling of missing frames
- Thread-safe chunk operations
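
One simple way to approach the thread-safety requirement is a single re-entrant lock around all chunk state. The sketch below is only an illustration of that idea; `LockedFrameIndex` is a hypothetical helper and says nothing about how the real component synchronizes access.

```python
import threading
from typing import Dict, List


class LockedFrameIndex:
    """Hypothetical helper: guards per-chunk frame lists with one re-entrant lock."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._frames: Dict[str, List[int]] = {}

    def add_frame(self, chunk_id: str, frame_id: int) -> None:
        with self._lock:
            self._frames.setdefault(chunk_id, []).append(frame_id)

    def get_frames(self, chunk_id: str) -> List[int]:
        with self._lock:
            # Return a copy so callers cannot mutate shared state outside the lock.
            return list(self._frames.get(chunk_id, []))
```
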
## Dependencies
### Internal Components
- **F10 Factor Graph Optimizer**: Chunk creation and factor management
- **F05 Image Input Pipeline**: Image retrieval
- **F08 Global Place Recognition**: Descriptor computation
- **F07 Sequential VO**: VO results for chunk building
### External Dependencies
- **numpy**: Array operations
## Data Models
### ChunkHandle
```python
class ChunkHandle(BaseModel):
    chunk_id: str
    flight_id: str
    start_frame_id: int
    end_frame_id: Optional[int]
    frames: List[int]
    is_active: bool
    has_anchor: bool
    anchor_frame_id: Optional[int]
    anchor_gps: Optional[GPSPoint]
    matching_status: str  # "unanchored", "matching", "anchored", "merged"
```
### ChunkBounds
```python
class ChunkBounds(BaseModel):
    estimated_center: GPSPoint
    estimated_radius: float  # meters
    confidence: float  # 0.0 to 1.0
```
### ChunkConfig
```python
class ChunkConfig(BaseModel):
    min_frames_for_matching: int = 5
    max_frames_per_chunk: int = 20
    descriptor_aggregation: str = "mean"  # "mean", "vlad", "max"
```
### Sim3Transform
```python
class Sim3Transform(BaseModel):
    translation: np.ndarray  # (3,) - translation vector
    rotation: np.ndarray  # (3, 3) rotation matrix or (4,) quaternion
    scale: float  # Scale factor
```
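
For reference, a Sim(3) transform is conventionally applied to a point as p' = s · R · p + t. The sketch below assumes the rotation is given as a (3, 3) matrix and that the transform maps source-chunk coordinates into the target chunk's frame; `apply_sim3` is a hypothetical helper, and the direction convention is not stated in this document.

```python
import numpy as np


def apply_sim3(
    points: np.ndarray, scale: float, rotation: np.ndarray, translation: np.ndarray
) -> np.ndarray:
    """Map (N, 3) points with p' = scale * R @ p + t, vectorized over rows."""
    # points @ rotation.T applies R to each row vector.
    return scale * points @ rotation.T + translation
```

With a 90-degree rotation about the z-axis, a scale of 2, and a translation of (1, 0, 0), the point (1, 0, 0) maps to (1, 2, 0).
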