diff --git a/.cursor/commands/2.planning/2.20_gen_epics.md b/.cursor/commands/2.planning/2.20_gen_epics.md index 7a97347..072f679 100644 --- a/.cursor/commands/2.planning/2.20_gen_epics.md +++ b/.cursor/commands/2.planning/2.20_gen_epics.md @@ -65,4 +65,6 @@ - Tasks - Technical enablers + ## Notes + - Be as concise as possible when formulating epics. The fewer words needed to convey the same meaning, the better the epic. diff --git a/docs/03_tests/00_test_summary.md b/docs/03_tests/00_test_summary.md index d94c7ad..d439418 100644 --- a/docs/03_tests/00_test_summary.md +++ b/docs/03_tests/00_test_summary.md @@ -3,7 +3,7 @@ ## Overview Comprehensive test specifications for the GPS-denied navigation system following the QA testing pyramid approach. -**Total Test Specifications**: 56 +**Total Test Specifications**: 49 ## Test Organization @@ -11,65 +11,77 @@ Comprehensive test specifications for the GPS-denied navigation system following Tests individual system components in isolation with their dependencies. 
**Vision Pipeline (01-04)**: -- 01: Sequential Visual Odometry (SuperPoint + LightGlue) -- 02: Global Place Recognition (AnyLoc) -- 03: Metric Refinement (LiteSAM) -- 04: Factor Graph Optimizer (GTSAM) +- 01: Sequential Visual Odometry (F07 - SuperPoint + LightGlue) +- 02: Global Place Recognition (F08 - AnyLoc/DINOv2) +- 03: Metric Refinement (F09 - LiteSAM) +- 04: Factor Graph Optimizer (F10 - GTSAM) **Data Management (05-08)**: -- 05: Satellite Data Manager -- 06: Coordinate Transformer -- 07: Image Input Pipeline -- 08: Image Rotation Manager +- 05: Satellite Data Manager (F04) +- 06: Coordinate Transformer (F13) +- 07: Image Input Pipeline (F05) +- 08: Image Rotation Manager (F06) **Service Infrastructure (09-12)**: -- 09: REST API (FastAPI endpoints for flight management, image uploads, user fixes) -- 10: SSE Event Streamer (Server-Sent Events for real-time result streaming) -- 11: Flight Manager -- 12: Result Manager +- 09: REST API (F01 - FastAPI endpoints) +- 10: SSE Event Streamer (F15 - real-time streaming) +- 11a: Flight Lifecycle Manager (F02.1 - CRUD, initialization, API delegation) +- 11b: Flight Processing Engine (F02.2 - processing loop, recovery coordination) +- 12: Result Manager (F14) **Support Components (13-16)**: -- 13: Model Manager (TensorRT) -- 14: Failure Recovery Coordinator -- 15: Configuration Manager -- 16: Database Layer +- 13: Model Manager (F16 - TensorRT) +- 14: Failure Recovery Coordinator (F11) +- 15: Configuration Manager (F17) +- 16: Database Layer (F03) ### System Integration Tests (21-25): Multi-Component Flows Tests integration between multiple components. 
- 21: End-to-End Normal Flight -- 22: Satellite to Vision Pipeline -- 23: Vision to Optimization Pipeline +- 22: Satellite to Vision Pipeline (F04 → F07/F08/F09) +- 23: Vision to Optimization Pipeline (F07/F08/F09 → F10) - 24: Multi-Component Error Propagation -- 25: Real-Time Streaming Pipeline +- 25: Real-Time Streaming Pipeline (F02 → F14 → F15) -### Acceptance Tests (31-43): Requirements Validation +### Acceptance Tests (31-50): Requirements Validation Tests mapped to 10 acceptance criteria. -**Accuracy (31-32)**: -- 31: AC-1 - 80% < 50m error -- 32: AC-2 - 60% < 20m error +**Accuracy (31-33)**: +- 31: AC-1 - 80% < 50m error (baseline) +- 32: AC-1 - 80% < 50m error (varied terrain) +- 33: AC-2 - 60% < 20m error (high precision) -**Robustness (33-35)**: -- 33: AC-3 - 350m outlier handling -- 34: AC-4 - Sharp turn recovery -- 35: AC-5 - Multi-fragment route connection +**Robustness - Outliers (34-35)**: +- 34: AC-3 - Single 350m outlier handling +- 35: AC-3 - Multiple outliers handling -**User Interaction (36)**: -- 36: AC-6 - User input after 3 failures +**Robustness - Sharp Turns (36-38)**: +- 36: AC-4 - Sharp turn zero overlap recovery +- 37: AC-4 - Sharp turn minimal overlap (<5%) +- 38: Outlier anchor detection -**Performance (37-38)**: -- 37: AC-7 - <5s per image -- 38: AC-8 - Real-time streaming + refinement +**Multi-Fragment (39)**: +- 39: AC-5 - Multi-fragment route connection (chunk architecture) -**Quality Metrics (39-40)**: -- 39: AC-9 - Registration rate >95% -- 40: AC-10 - MRE <1.0 pixels +**User Interaction (40)**: +- 40: AC-6 - User input after 3 consecutive failures -**Cross-Cutting (41-43)**: -- 41: Long flight (3000 images) -- 42: Degraded satellite data -- 43: Complete system validation +**Performance (41-44)**: +- 41: AC-7 - <5s single image processing +- 42: AC-7 - Sustained throughput performance +- 43: AC-8 - Real-time streaming results +- 44: AC-8 - Async refinement delivery + +**Quality Metrics (45-47)**: +- 45: AC-9 - Registration rate 
>95% (baseline) +- 46: AC-9 - Registration rate >95% (challenging conditions) +- 47: AC-10 - Mean Reprojection Error <1.0 pixels + +**Cross-Cutting (48-50)**: +- 48: Long flight (3000 images) +- 49: Degraded satellite data +- 50: Complete system acceptance validation **Chunk-Based Recovery (55-56)**: - 55: Chunk rotation recovery (rotation sweeps for chunks) @@ -109,16 +121,39 @@ Tests using GPS-analyzed test datasets. | AC | Requirement | Test Specs | Status | |----|-------------|------------|--------| -| AC-1 | 80% < 50m error | 31, 43, 51, 54 | ✓ Covered | -| AC-2 | 60% < 20m error | 32, 43, 51, 54 | ✓ Covered | -| AC-3 | 350m outlier robust | 33, 43, 52, 54 | ✓ Covered | -| AC-4 | Sharp turn <5% overlap | 34, 43, 53, 54, 55 | ✓ Covered | -| AC-5 | Multi-fragment connection | 35, 39, 43, 56 | ✓ Covered | -| AC-6 | User input after 3 failures | 36, 43 | ✓ Covered | -| AC-7 | <5s per image | 37, 43, 51, 54 | ✓ Covered | -| AC-8 | Real-time + refinement | 38, 43 | ✓ Covered | -| AC-9 | Registration >95% | 39, 43, 51, 54 | ✓ Covered | -| AC-10 | MRE <1.0px | 40, 43 | ✓ Covered | +| AC-1 | 80% < 50m error | 31, 32, 50, 51, 54 | ✓ Covered | +| AC-2 | 60% < 20m error | 33, 50, 51, 54 | ✓ Covered | +| AC-3 | 350m outlier robust | 34, 35, 50, 52, 54 | ✓ Covered | +| AC-4 | Sharp turn <5% overlap | 36, 37, 50, 53, 54, 55 | ✓ Covered | +| AC-5 | Multi-fragment connection | 39, 50, 56 | ✓ Covered | +| AC-6 | User input after 3 failures | 40, 50 | ✓ Covered | +| AC-7 | <5s per image | 41, 42, 50, 51, 54 | ✓ Covered | +| AC-8 | Real-time + refinement | 43, 44, 50 | ✓ Covered | +| AC-9 | Registration >95% | 45, 46, 50, 51, 54 | ✓ Covered | +| AC-10 | MRE <1.0px | 47, 50 | ✓ Covered | + +## Component to Test Mapping + +| Component | ID | Integration Test | +|-----------|-----|------------------| +| Flight API | F01 | 09 | +| Flight Lifecycle Manager | F02.1 | 11a | +| Flight Processing Engine | F02.2 | 11b | +| Flight Database | F03 | 16 | +| Satellite Data Manager | F04 | 05 | 
+| Image Input Pipeline | F05 | 07 | +| Image Rotation Manager | F06 | 08 | +| Sequential Visual Odometry | F07 | 01 | +| Global Place Recognition | F08 | 02 | +| Metric Refinement | F09 | 03 | +| Factor Graph Optimizer | F10 | 04 | +| Failure Recovery Coordinator | F11 | 14 | +| Route Chunk Manager | F12 | 39, 55, 56 | +| Coordinate Transformer | F13 | 06 | +| Result Manager | F14 | 12 | +| SSE Event Streamer | F15 | 10 | +| Model Manager | F16 | 13 | +| Configuration Manager | F17 | 15 | ## Test Execution Strategy @@ -132,14 +167,15 @@ Tests using GPS-analyzed test datasets. - Validate end-to-end flows - Verify error handling across components -### Phase 3: Acceptance Testing (31-43) +### Phase 3: Acceptance Testing (31-50) - Validate all acceptance criteria - Use GPS-analyzed real data - Measure against requirements -### Phase 4: Special Scenarios (51-54) +### Phase 4: Special Scenarios (51-56) - Test specific GPS-identified situations - Validate outliers and sharp turns +- Chunk-based recovery scenarios - Full system validation ## Success Criteria Summary @@ -164,4 +200,3 @@ Tests using GPS-analyzed test datasets. - Specifications ready for QA team implementation - No code included per requirement - Tests cover all components and all acceptance criteria - diff --git a/docs/03_tests/04_factor_graph_optimizer_integration_spec.md b/docs/03_tests/04_factor_graph_optimizer_integration_spec.md index 5933f3b..73670a2 100644 --- a/docs/03_tests/04_factor_graph_optimizer_integration_spec.md +++ b/docs/03_tests/04_factor_graph_optimizer_integration_spec.md @@ -1,15 +1,18 @@ -# Integration Test: Factor Graph Optimizer +# Integration Test: Factor Graph Optimizer (F10) ## Summary -Validate the Factor Graph Optimizer component using GTSAM to fuse sequential relative poses (L1) and absolute GPS anchors (L3) into a globally consistent trajectory. 
+Validate the Factor Graph Optimizer component using GTSAM to fuse sequential relative poses (L1) and absolute GPS anchors (L3) into a globally consistent trajectory, with native multi-chunk support for disconnected route segments. ## Component Under Test -**Component**: Factor Graph Optimizer -**Location**: `gps_denied_10_factor_graph_optimizer` +**Component**: Factor Graph Optimizer (F10) +**Interface**: `IFactorGraphOptimizer` **Dependencies**: -- Sequential Visual Odometry (L1) - provides relative factors -- Metric Refinement (L3) - provides absolute GPS factors -- Coordinate Transformer +- F07 Sequential Visual Odometry - provides relative factors +- F09 Metric Refinement - provides absolute GPS factors +- F12 Route Chunk Manager - chunk lifecycle (F10 provides low-level graph ops) +- F13 Coordinate Transformer +- H02 GSD Calculator - scale resolution +- H03 Robust Kernels - outlier handling - GTSAM library ## Detailed Description @@ -59,6 +62,78 @@ The optimizer is the "brain" of ASTRAL-Next, reconciling potentially conflicting - **Input**: Add measurements incrementally (simulate real-time operation) - **Expected**: Trajectory should converge smoothly, past poses may be refined +### Test Case 7: Create Chunk Subgraph +- **Input**: flight_id, chunk_id, start_frame_id = 20 +- **Expected**: + - create_chunk_subgraph() returns True + - New subgraph created for chunk + - Chunk isolated from main trajectory + +### Test Case 8: Add Relative Factors to Chunk +- **Chunk**: chunk_2 with frames 20-30 +- **Input**: 10 relative factors from VO +- **Expected**: + - add_relative_factor_to_chunk() returns True for each + - Factors added to chunk's subgraph only + - Main trajectory unaffected + +### Test Case 9: Add Chunk Anchor +- **Chunk**: chunk_2 (frames 20-30, unanchored) +- **Input**: GPS anchor at frame 25 +- **Expected**: + - add_chunk_anchor() returns True + - Chunk can now be merged + - Chunk optimization triggered + +### Test Case 10: Optimize Chunk +- **Chunk**: 
chunk_2 with anchor +- **Input**: optimize_chunk(chunk_id, iterations=10) +- **Expected**: + - Returns OptimizationResult + - converged = True + - Chunk trajectory consistent + - Other chunks unaffected + +### Test Case 11: Merge Chunk Subgraphs +- **Chunks**: chunk_1 (frames 1-10), chunk_2 (frames 20-30, anchored) +- **Input**: merge_chunk_subgraphs(flight_id, chunk_2, chunk_1, transform) +- **Expected**: + - Returns True + - chunk_2 merged into chunk_1 + - Sim(3) transform applied correctly + - Global consistency maintained + +### Test Case 12: Get Chunk Trajectory +- **Chunk**: chunk_2 with 10 frames +- **Input**: get_chunk_trajectory(flight_id, chunk_id) +- **Expected**: + - Returns Dict[int, Pose] with 10 frames + - Poses in chunk's coordinate system + +### Test Case 13: Optimize Global +- **Setup**: 3 chunks, 2 anchored, 1 merged +- **Input**: optimize_global(flight_id, iterations=50) +- **Expected**: + - All chunks optimized together + - Global consistency achieved + - Returns OptimizationResult with all frame IDs + +### Test Case 14: Multi-Flight Isolation +- **Setup**: 2 flights processing simultaneously +- **Input**: Add factors to both flights +- **Expected**: + - Each flight's graph isolated + - No cross-contamination + - Independent optimization results + +### Test Case 15: Delete Flight Graph +- **Setup**: Flight with complex trajectory and chunks +- **Input**: delete_flight_graph(flight_id) +- **Expected**: + - Returns True + - All resources cleaned up + - No memory leaks + ## Expected Output For each test case: @@ -128,11 +203,35 @@ For each test case: - Each incremental update completes in <100ms - Final trajectory matches batch optimization (within 5m) +**Test Cases 7-9 (Chunk Creation & Factors)**: +- Chunk subgraph created successfully +- Factors added to correct chunk +- Chunk anchor enables merging + +**Test Cases 10-11 (Chunk Optimization & Merging)**: +- Chunk optimizes independently +- Sim(3) transform applied correctly +- Merged trajectory 
globally consistent + +**Test Cases 12-13 (Chunk Queries & Global)**: +- Chunk trajectory retrieved correctly +- Global optimization handles all chunks + +**Test Cases 14-15 (Isolation & Cleanup)**: +- Multi-flight isolation maintained +- Resource cleanup complete + ## Maximum Expected Time - **Small graph (10 poses)**: < 500ms - **Medium graph (30 poses)**: < 1000ms - **Incremental update**: < 100ms per new pose -- **Total test suite**: < 30 seconds +- **Create chunk subgraph**: < 10ms +- **Add factor to chunk**: < 5ms +- **Add chunk anchor**: < 50ms +- **Optimize chunk (10 frames)**: < 100ms +- **Merge chunks**: < 200ms +- **Optimize global (50 frames, 3 chunks)**: < 500ms +- **Total test suite**: < 60 seconds ## Test Execution Steps @@ -173,14 +272,17 @@ For each test case: ## Pass/Fail Criteria **Overall Test Passes If**: -- At least 5 out of 6 test cases meet their individual success criteria +- At least 12 out of 15 test cases meet their individual success criteria - Test Case 4 (Baseline) must pass (validates AC-1, AC-2, AC-10) +- Test Cases 7-11 (Chunk operations) must pass (validates multi-chunk architecture) - No crashes or numerical instabilities - Memory usage remains stable **Test Fails If**: - Test Case 4 fails to meet AC-1, AC-2, or AC-10 -- More than 1 test case fails completely +- Chunk creation/merging fails +- Multi-flight isolation violated +- More than 3 test cases fail completely - Optimizer produces NaN or infinite values - Processing time exceeds 2x maximum expected time - Memory leak detected (>500MB growth) diff --git a/docs/03_tests/11_flight_manager_integration_spec.md b/docs/03_tests/11_flight_manager_integration_spec.md deleted file mode 100644 index cc760ed..0000000 --- a/docs/03_tests/11_flight_manager_integration_spec.md +++ /dev/null @@ -1,395 +0,0 @@ -# Integration Test: Flight Manager - -## Summary -Validate the Flight Manager component responsible for managing flight sessions, coordinating image processing, and tracking flight 
state throughout the ASTRAL-Next pipeline. - -## Component Under Test -**Component**: Flight Manager -**Location**: `gps_denied_02_flight_manager` -**Dependencies**: -- Database Layer (flight persistence) -- Image Input Pipeline -- Sequential Visual Odometry (L1) -- Global Place Recognition (L2) -- Metric Refinement (L3) -- Factor Graph Optimizer -- Failure Recovery Coordinator -- Result Manager - -## Detailed Description -This test validates that the Flight Manager can: -1. Create and initialize new flight sessions -2. Manage flight lifecycle (created → processing → completed → archived) -3. Queue and dispatch images for processing -4. Coordinate between all processing layers (L1, L2, L3, Factor Graph) -5. Track processing progress and statistics -6. Handle processing failures and recovery -7. Manage concurrent flights -8. Persist flight state across system restarts -9. Enforce constraints (image ordering, missing frames) -10. Trigger user input requests when automated processing fails - -The Flight Manager is the central orchestrator coordinating all components in the ASTRAL-Next system. 
- -## Input Data - -### Test Case 1: Create New Flight -- **Input**: - - Flight name: "Test_Baseline_Flight" - - Start GPS: 48.275292, 37.385220 - - Altitude: 400m - - Camera params: focal_length=25mm, sensor_width=23.5mm, resolution=6252x4168 -- **Expected**: Flight created with unique ID, state = "created" - -### Test Case 2: Add Images to Flight -- **Flight**: Test_Baseline_Flight -- **Images**: AD000001-AD000010 (10 images in order) -- **Expected**: All images queued, sequence maintained - -### Test Case 3: Process Flight (Normal) -- **Flight**: Test_Baseline_Flight with AD000001-AD000010 -- **Expected**: - - State transitions: created → processing → completed - - All 10 images processed successfully - - Results available - -### Test Case 4: Process with Sharp Turn -- **Flight**: Test_Sharp_Turn -- **Images**: AD000042, AD000044, AD000045, AD000046 (skip AD000043) -- **Expected**: - - Detect missing frame AD000043 - - L1 tracking fails, L2 recovers location - - Flight completes successfully - -### Test Case 5: Process with Outlier -- **Flight**: Test_Outlier -- **Images**: AD000045-AD000050 (includes 268m outlier) -- **Expected**: - - Outlier detected by Factor Graph - - Robust cost function handles outlier - - Other images processed correctly - -### Test Case 6: Process Long Flight -- **Flight**: Test_Long_Flight -- **Images**: All 60 images (AD000001-AD000060) -- **Expected**: - - Processing completes without failure - - Registration rate > 95% (AC-9) - - Accuracy targets met (AC-1, AC-2) - -### Test Case 7: Handle Processing Failure -- **Flight**: Test_Failure -- **Images**: AD000001-AD000005 -- **Scenario**: Simulate L1, L2, L3 all failing for AD000003 -- **Expected**: - - Failure detected - - Failure Recovery Coordinator invoked - - User input requested (AC-6) - - Flight state = "awaiting_user_input" - -### Test Case 8: Apply User Fix -- **Flight**: Test_Failure (from Test Case 7) -- **User Input**: GPS for AD000003 = 48.274520, 37.381657 -- **Expected**: 
- - User fix accepted - - Processing resumes - - Factor Graph incorporates fix - - Flight completes - -### Test Case 9: Concurrent Flights -- **Flights**: 3 flights processing simultaneously - - Flight A: AD000001-AD000020 - - Flight B: AD000021-AD000040 - - Flight C: AD000041-AD000060 -- **Expected**: - - All 3 flights process without interference - - No resource contention issues - - All complete successfully - -### Test Case 10: Flight State Persistence -- **Scenario**: - - Start flight with AD000001-AD000030 - - Process 15 images - - Simulate system restart - - Reload flight state - - Continue processing remaining images -- **Expected**: Flight resumes from last checkpoint - -### Test Case 11: Get Flight Statistics -- **Flight**: Completed flight with 60 images -- **Expected Statistics**: - - total_images: 60 - - processed_images: 60 - - failed_images: 0-2 - - success_rate: > 0.95 - - mean_error_m: < 30 - - processing_time_s: < 300 - - registration_rate: > 0.95 - -### Test Case 12: Archive Flight -- **Flight**: Completed flight -- **Expected**: - - Flight state = "archived" - - Results still accessible - - No longer in active processing queue - -## Expected Output - -For each test case: -```json -{ - "flight_id": "unique_flight_identifier", - "flight_name": "string", - "state": "created|processing|completed|failed|awaiting_user_input|archived", - "created_at": "timestamp", - "updated_at": "timestamp", - "statistics": { - "total_images": , - "processed_images": , - "failed_images": , - "awaiting_user_input": , - "success_rate": , - "mean_error_m": , - "registration_rate": , - "processing_time_s": - }, - "current_image": "string|null", - "next_action": "process_next|await_user_input|complete|none" -} -``` - -## Success Criteria - -**Test Case 1 (Create)**: -- Flight created successfully -- Unique flight_id assigned -- State = "created" -- All parameters stored correctly - -**Test Case 2 (Add Images)**: -- All 10 images queued -- Sequence numbers assigned (1-10) 
-- No duplicates - -**Test Case 3 (Process Normal)**: -- All images processed -- State = "completed" -- success_rate = 1.0 -- Processing time < 60 seconds (10 images) - -**Test Case 4 (Sharp Turn)**: -- Missing frame detected -- L2 successfully recovers location for AD000044 -- success_rate ≥ 0.75 (3/4 or better) - -**Test Case 5 (Outlier)**: -- Outlier detected and handled -- success_rate ≥ 0.83 (5/6 or better) -- Non-outlier images have low error - -**Test Case 6 (Long Flight)**: -- All 60 images processed -- registration_rate > 0.95 (AC-9) -- success_rate > 0.80 -- Processing time < 300 seconds (< 5s per image, AC-7) - -**Test Case 7 (Failure)**: -- Failure detected for AD000003 -- State transitions to "awaiting_user_input" -- User notification sent via SSE -- Processing paused appropriately - -**Test Case 8 (User Fix)**: -- User fix accepted -- Processing resumes automatically -- State transitions back to "processing" -- Flight completes successfully - -**Test Case 9 (Concurrent)**: -- All 3 flights complete -- No race conditions -- No resource exhaustion -- Each flight independent - -**Test Case 10 (Persistence)**: -- State saved correctly -- Reloaded state matches pre-restart -- Processing continues from checkpoint -- No image reprocessing - -**Test Case 11 (Statistics)**: -- All statistics calculated correctly -- Statistics updated in real-time -- Match actual processing results - -**Test Case 12 (Archive)**: -- State = "archived" -- Flight no longer active -- Results preserved - -## Maximum Expected Time -- **Create flight**: < 500ms -- **Add 10 images**: < 2 seconds -- **Process 10 images**: < 60 seconds -- **Process 60 images**: < 300 seconds (5s per image, AC-7) -- **Apply user fix**: < 1 second -- **Get statistics**: < 200ms -- **Total test suite**: < 600 seconds (10 minutes) - -## Test Execution Steps - -1. **Setup Phase**: - a. Initialize Flight Manager - b. Clear any existing test flights from database - c. Prepare test images - d. 
Configure processing parameters - -2. **Test Case 1 - Create**: - a. Call create_flight() with parameters - b. Verify flight_id returned - c. Check database for flight record - d. Validate initial state - -3. **Test Case 2 - Add Images**: - a. Call add_images() with 10 images - b. Verify all queued - c. Check sequence assignment - d. Validate database state - -4. **Test Case 3 - Process Normal**: - a. Call start_processing() - b. Monitor state transitions - c. Wait for completion - d. Verify results - -5. **Test Case 4 - Sharp Turn**: - a. Create flight with gap in sequence - b. Start processing - c. Monitor L1 failure, L2 recovery - d. Verify completion - -6. **Test Case 5 - Outlier**: - a. Process flight with outlier - b. Monitor Factor Graph handling - c. Verify outlier detection - d. Check final results - -7. **Test Case 6 - Long Flight**: - a. Process all 60 images - b. Monitor progress continuously - c. Measure processing time - d. Validate against AC-1, AC-2, AC-7, AC-9 - -8. **Test Case 7 - Failure**: - a. Simulate triple failure for one image - b. Verify failure detection - c. Check state transition - d. Confirm user input request - -9. **Test Case 8 - User Fix**: - a. Submit user fix - b. Verify acceptance - c. Monitor processing resume - d. Check incorporation into results - -10. **Test Case 9 - Concurrent**: - a. Start 3 flights simultaneously - b. Monitor all 3 in parallel - c. Verify isolation - d. Wait for all completions - -11. **Test Case 10 - Persistence**: - a. Start long flight - b. Save state mid-processing - c. Simulate restart (reload from DB) - d. Continue and complete - -12. **Test Case 11 - Statistics**: - a. Query statistics after completion - b. Validate calculations - c. Compare with ground truth - d. Check all metrics present - -13. **Test Case 12 - Archive**: - a. Archive completed flight - b. Verify state change - c. Check accessibility - d. 
Ensure not in active queue - -## Pass/Fail Criteria - -**Overall Test Passes If**: -- All 12 test cases meet their success criteria -- Flight state machine transitions correctly -- All images processed (or user input requested) -- No data loss or corruption -- Concurrent flights isolated -- State persistence works correctly - -**Test Fails If**: -- Any flight enters invalid state -- Images processed out of order -- Duplicate processing occurs -- Statistics incorrect -- Race conditions in concurrent processing -- State persistence fails -- Memory or resource leaks - -## Additional Validation - -**State Machine Validation**: -Valid state transitions: -- created → processing -- processing → completed -- processing → awaiting_user_input -- awaiting_user_input → processing -- processing → failed -- completed → archived -- failed → archived - -Invalid transitions should be rejected. - -**Image Queue Management**: -- FIFO processing order -- No image skipped (unless failure) -- Sequence numbers maintained -- Duplicate detection - -**Resource Management**: -- Memory usage bounded per flight -- No orphaned resources after completion -- Cleanup on flight deletion -- Limits on concurrent flights (if configured) - -**Error Recovery**: -- Graceful handling of component failures -- Retry logic for transient errors -- User intervention for persistent failures -- Clear error messages - -**Integration with Acceptance Criteria**: -- **AC-6**: User input mechanism tested (Test Cases 7, 8) -- **AC-7**: Processing time < 5s per image (Test Case 6) -- **AC-8**: Real-time results via SSE (monitored during processing) -- **AC-9**: Registration rate > 95% (Test Case 6) - -**Performance Metrics**: -- Throughput: images processed per second -- Latency: time from image queued to result available -- Overhead: Flight Manager processing vs actual vision processing -- Scalability: performance with 1, 10, 100 flights - -##Database Operations**: -- Atomic transactions for state changes -- Proper 
indexing for queries -- No deadlocks in concurrent operations -- Backup and recovery procedures - -**Configuration Options**: -Test various configuration: -- Max concurrent images processing -- Retry attempts for failed processing -- Timeout values -- Buffer sizes -- Checkpoint frequency for persistence - diff --git a/docs/03_tests/11a_flight_lifecycle_manager_integration_spec.md b/docs/03_tests/11a_flight_lifecycle_manager_integration_spec.md new file mode 100644 index 0000000..8e0bc12 --- /dev/null +++ b/docs/03_tests/11a_flight_lifecycle_manager_integration_spec.md @@ -0,0 +1,194 @@ +# Integration Test: Flight Lifecycle Manager (F02.1) + +## Summary +Validate the Flight Lifecycle Manager component responsible for flight CRUD operations, system initialization, and API request routing. + +## Component Under Test +**Component**: Flight Lifecycle Manager (F02.1) +**Interface**: `IFlightLifecycleManager` +**Dependencies**: +- F03 Flight Database (persistence) +- F04 Satellite Data Manager (prefetching) +- F05 Image Input Pipeline (image queuing) +- F13 Coordinate Transformer (ENU origin) +- F15 SSE Event Streamer (stream creation) +- F16 Model Manager (model loading) +- F17 Configuration Manager (config loading) +- F02.2 Flight Processing Engine (managed child) + +## Detailed Description +This test validates that the Flight Lifecycle Manager can: +1. Create and initialize new flight sessions +2. Manage flight lifecycle (created → active → completed) +3. Validate waypoints and geofences +4. Queue images for processing (delegates to F05, triggers F02.2) +5. Handle user fix requests (delegates to F02.2) +6. Create SSE client streams (delegates to F15) +7. Initialize system components on startup +8. Manage F02.2 Processing Engine instances per flight + +The Lifecycle Manager is the external-facing component handling API requests and delegating to internal processing. 
+ +## Input Data + +### Test Case 1: Create New Flight +- **Input**: + - Flight name: "Test_Baseline_Flight" + - Start GPS: 48.275292, 37.385220 + - Altitude: 400m + - Camera params: focal_length=25mm, sensor_width=23.5mm, resolution=6252x4168 +- **Expected**: + - Flight created with unique ID + - F13.set_enu_origin() called with start_gps + - F04.prefetch_route_corridor() triggered + - Flight persisted to F03 + +### Test Case 2: Get Flight +- **Input**: Existing flight_id +- **Expected**: Flight data returned with current state + +### Test Case 3: Get Flight State +- **Input**: Existing flight_id +- **Expected**: FlightState returned (processing status, current frame, etc.) + +### Test Case 4: Delete Flight +- **Input**: Existing flight_id +- **Expected**: + - Flight marked deleted in F03 + - Associated F02.2 engine stopped + - Resources cleaned up + +### Test Case 5: Update Flight Status +- **Input**: flight_id, status update (e.g., pause, resume) +- **Expected**: Status updated, F02.2 notified if needed + +### Test Case 6: Update Single Waypoint +- **Input**: flight_id, waypoint_id, new Waypoint data +- **Expected**: Waypoint updated in F03 + +### Test Case 7: Batch Update Waypoints +- **Input**: flight_id, List of updated Waypoints +- **Expected**: All waypoints updated atomically + +### Test Case 8: Validate Waypoint +- **Input**: Waypoint with GPS coordinates +- **Expected**: ValidationResult with valid/invalid and reason + +### Test Case 9: Validate Geofence +- **Input**: Geofence polygon +- **Expected**: ValidationResult (valid polygon, within limits) + +### Test Case 10: Queue Images (Delegation) +- **Input**: flight_id, ImageBatch (10 images) +- **Expected**: + - F05.queue_batch() called + - F02.2 engine started/triggered + - BatchQueueResult returned + +### Test Case 11: Handle User Fix (Delegation) +- **Input**: flight_id, UserFixRequest (frame_id, GPS anchor) +- **Expected**: + - Active F02.2 engine retrieved + - engine.apply_user_fix() called + - 
UserFixResult returned + +### Test Case 12: Create Client Stream (Delegation) +- **Input**: flight_id, client_id +- **Expected**: + - F15.create_stream() called + - StreamConnection returned + +### Test Case 13: Convert Object to GPS (Delegation) +- **Input**: flight_id, frame_id, pixel coordinates +- **Expected**: + - F13.image_object_to_gps() called + - GPSPoint returned + +### Test Case 14: System Initialization +- **Input**: Call initialize_system() +- **Expected**: + - F17.load_config() called + - F16.load_model() called for all models + - F03 database initialized + - F04 cache initialized + - F08 index loaded + - Returns True on success + +### Test Case 15: Get Flight Metadata +- **Input**: flight_id +- **Expected**: FlightMetadata (camera params, altitude, waypoint count, etc.) + +### Test Case 16: Validate Flight Continuity +- **Input**: List of Waypoints +- **Expected**: ValidationResult (continuous path, no gaps > threshold) + +## Expected Output + +For each test case: +```json +{ + "flight_id": "unique_flight_identifier", + "flight_name": "string", + "state": "created|active|completed|paused|deleted", + "created_at": "timestamp", + "updated_at": "timestamp", + "enu_origin": { + "latitude": , + "longitude": + }, + "waypoint_count": , + "has_active_engine": +} +``` + +## Success Criteria + +**Test Cases 1-5 (Flight CRUD)**: +- Flight created/retrieved/updated/deleted correctly +- State transitions valid +- Database persistence verified + +**Test Cases 6-9 (Validation)**: +- Waypoint/geofence validation correct +- Invalid inputs rejected with reason +- Edge cases handled + +**Test Cases 10-13 (Delegation)**: +- Correct components called +- Parameters passed correctly +- Results returned correctly + +**Test Case 14 (Initialization)**: +- All components initialized in correct order +- Failures handled gracefully +- Startup time < 30 seconds + +**Test Cases 15-16 (Metadata/Continuity)**: +- Metadata accurate +- Continuity validation correct + +## Maximum 
Expected Time +- Create flight: < 500ms (excluding prefetch) +- Get/Update flight: < 100ms +- Delete flight: < 500ms +- Queue images: < 2 seconds (10 images) +- User fix delegation: < 100ms +- System initialization: < 30 seconds +- Total test suite: < 120 seconds + +## Pass/Fail Criteria + +**Overall Test Passes If**: +- All 16 test cases pass +- CRUD operations work correctly +- Delegation to child components works +- System initialization completes +- No resource leaks + +**Test Fails If**: +- Flight CRUD fails +- Delegation fails to reach correct component +- System initialization fails +- Invalid state transitions allowed +- Resource cleanup fails on delete + diff --git a/docs/03_tests/11b_flight_processing_engine_integration_spec.md b/docs/03_tests/11b_flight_processing_engine_integration_spec.md new file mode 100644 index 0000000..5f82f9d --- /dev/null +++ b/docs/03_tests/11b_flight_processing_engine_integration_spec.md @@ -0,0 +1,241 @@ +# Integration Test: Flight Processing Engine (F02.2) + +## Summary +Validate the Flight Processing Engine component responsible for the main processing loop, frame-by-frame orchestration, recovery coordination, and chunk management. + +## Component Under Test +**Component**: Flight Processing Engine (F02.2) +**Interface**: `IFlightProcessingEngine` +**Dependencies**: +- F05 Image Input Pipeline (image source) +- F06 Image Rotation Manager (pre-processing) +- F07 Sequential Visual Odometry (motion estimation) +- F09 Metric Refinement (satellite alignment) +- F10 Factor Graph Optimizer (state estimation) +- F11 Failure Recovery Coordinator (recovery logic) +- F12 Route Chunk Manager (chunk state) +- F14 Result Manager (saving results) +- F15 SSE Event Streamer (real-time updates) + +## Detailed Description +This test validates that the Flight Processing Engine can: +1. Run the main processing loop (Image → VO → Graph → Result) +2. Manage flight status (Processing, Blocked, Recovering, Completed) +3. 
Coordinate chunk lifecycle with F12 +4. Handle tracking loss and delegate to F11 +5. Apply user fixes and resume processing +6. Publish results via F14/F15 +7. Manage background chunk matching tasks +8. Handle concurrent processing gracefully + +The Processing Engine runs in a background thread per active flight. + +## Input Data + +### Test Case 1: Start Processing +- **Flight**: Test_Baseline_Flight with 10 queued images +- **Action**: Call start_processing(flight_id) +- **Expected**: + - Background processing thread started + - First image retrieved from F05 + - Processing loop begins + +### Test Case 2: Stop Processing +- **Flight**: Active flight with processing in progress +- **Action**: Call stop_processing(flight_id) +- **Expected**: + - Processing loop stopped gracefully + - Current frame completed or cancelled + - State saved + +### Test Case 3: Process Single Frame (Normal) +- **Input**: Single frame with good tracking +- **Expected**: + - F06.requires_rotation_sweep() checked + - F07.compute_relative_pose() called + - F12.add_frame_to_chunk() called + - F10.add_relative_factor() called + - F10.optimize_chunk() called + - F14.update_frame_result() called + - SSE event sent + +### Test Case 4: Process Frame (First Frame / Sharp Turn) +- **Input**: First frame or frame after sharp turn +- **Expected**: + - F06.requires_rotation_sweep() returns True + - F06.rotate_image_360() called (12 rotations) + - F09.align_to_satellite() called for each rotation + - Best rotation selected + - Heading updated + +### Test Case 5: Process Frame (Tracking Lost) +- **Input**: Frame with low VO confidence +- **Expected**: + - F11.check_confidence() returns LOST + - F11.create_chunk_on_tracking_loss() called + - New chunk created proactively + - handle_tracking_loss() invoked + +### Test Case 6: Handle Tracking Loss (Progressive Search) +- **Input**: Frame with tracking lost, recoverable +- **Expected**: + - F11.start_search() called + - F11.try_current_grid() called 
iteratively + - Grid expansion (1→4→9→16→25) + - Match found, F11.mark_found() called + - Processing continues + +### Test Case 7: Handle Tracking Loss (User Input Needed) +- **Input**: Frame with tracking lost, not recoverable +- **Expected**: + - Progressive search exhausted (25 tiles) + - F11.create_user_input_request() called + - Engine receives UserInputRequest + - F15.send_user_input_request() called + - Status set to BLOCKED + - Processing paused + +### Test Case 8: Apply User Fix +- **Input**: UserFixRequest with GPS anchor +- **Action**: Call apply_user_fix(flight_id, fix_data) +- **Expected**: + - F11.apply_user_anchor() called + - Anchor applied to factor graph + - Status set to PROCESSING + - Processing loop resumes + +### Test Case 9: Get Active Chunk +- **Flight**: Active flight with chunks +- **Action**: Call get_active_chunk(flight_id) +- **Expected**: + - F12.get_active_chunk() called + - Returns current active chunk or None + +### Test Case 10: Create New Chunk +- **Input**: Tracking loss detected +- **Action**: Call create_new_chunk(flight_id, frame_id) +- **Expected**: + - F12.create_chunk() called + - New chunk created in factor graph + - Returns ChunkHandle + +### Test Case 11: Process Flight (Full - Normal) +- **Flight**: 30 images (AD000001-030) +- **Expected**: + - All 30 images processed + - Status transitions: Processing → Completed + - Results published for all frames + - Processing time < 150 seconds (5s per image) + +### Test Case 12: Process Flight (With Sharp Turn) +- **Flight**: AD000042, AD000044, AD000045, AD000046 (skip AD000043) +- **Expected**: + - Tracking lost at AD000044 + - New chunk created + - Recovery succeeds (L2/L3) + - Flight completes + +### Test Case 13: Process Flight (With Outlier) +- **Flight**: AD000045-050 (includes 268m outlier) +- **Expected**: + - Outlier detected by factor graph + - Robust kernel handles outlier + - Other images processed correctly + +### Test Case 14: Process Flight (Long) +- **Flight**: 
All 60 images (AD000001-060) +- **Expected**: + - Processing completes + - Registration rate > 95% (AC-9) + - Processing time < 300 seconds (AC-7) + +### Test Case 15: Background Chunk Matching +- **Flight**: Flight with multiple unanchored chunks +- **Expected**: + - Background task processes chunks + - F11.process_unanchored_chunks() called periodically + - Chunks matched and merged asynchronously + - Frame processing not blocked + +### Test Case 16: State Persistence and Recovery +- **Scenario**: + - Process 15 frames + - Simulate restart + - Resume processing +- **Expected**: + - State saved to F03 before restart + - State restored on resume + - Processing continues from frame 16 + +## Expected Output + +For each frame processed: +```json +{ + "flight_id": "string", + "frame_id": <int>, + "status": "processed|failed|skipped|blocked", + "gps": { + "latitude": <float>, + "longitude": <float> + }, + "confidence": <float>, + "chunk_id": "string", + "processing_time_ms": <int> +} +``` + +## Success Criteria + +**Test Cases 1-2 (Start/Stop)**: +- Processing starts/stops correctly +- No resource leaks +- Graceful shutdown + +**Test Cases 3-5 (Frame Processing)**: +- Correct components called in order +- State updates correctly +- Results published + +**Test Cases 6-8 (Recovery)**: +- Progressive search works +- User input flow works +- Recovery successful + +**Test Cases 9-10 (Chunk Management)**: +- Chunks created/managed correctly +- F12 integration works + +**Test Cases 11-14 (Full Flights)**: +- All acceptance criteria met +- Processing completes successfully + +**Test Cases 15-16 (Background/Recovery)**: +- Background tasks work +- State persistence works + +## Maximum Expected Time +- Start/stop processing: < 500ms +- Process single frame: < 5 seconds (AC-7) +- Handle tracking loss: < 2 seconds +- Apply user fix: < 1 second +- Process 30 images: < 150 seconds +- Process 60 images: < 300 seconds +- Total test suite: < 600 seconds + +## Pass/Fail Criteria + +**Overall Test Passes If**: +- All 16 
test cases pass +- Processing loop works correctly +- Recovery mechanisms work +- Chunk management works +- Performance targets met + +**Test Fails If**: +- Processing loop crashes +- Recovery fails when it should succeed +- User input not requested when needed +- Performance exceeds 5s per image +- State persistence fails + diff --git a/docs/03_tests/14_failure_recovery_coordinator_integration_spec.md b/docs/03_tests/14_failure_recovery_coordinator_integration_spec.md index ac5d75c..0c17028 100644 --- a/docs/03_tests/14_failure_recovery_coordinator_integration_spec.md +++ b/docs/03_tests/14_failure_recovery_coordinator_integration_spec.md @@ -1,93 +1,308 @@ -# Integration Test: Failure Recovery Coordinator +# Integration Test: Failure Recovery Coordinator (F11) ## Summary -Validate the Failure Recovery Coordinator that detects processing failures and coordinates recovery strategies including user intervention per AC-6. +Validate the Failure Recovery Coordinator that detects processing failures and coordinates recovery strategies. F11 is a pure logic component that returns status objects - it does NOT directly emit events or communicate with clients. ## Component Under Test -**Component**: Failure Recovery Coordinator -**Location**: `gps_denied_11_failure_recovery_coordinator` -**Dependencies**: Flight Manager, All vision layers (L1, L2, L3), Factor Graph Optimizer, SSE Event Streamer +**Component**: Failure Recovery Coordinator (F11) +**Interface**: `IFailureRecoveryCoordinator` +**Dependencies**: +- F04 Satellite Data Manager (search grids) +- F06 Image Rotation Manager (rotation sweeps) +- F08 Global Place Recognition (candidate retrieval) +- F09 Metric Refinement (alignment) +- F10 Factor Graph Optimizer (anchor application) +- F12 Route Chunk Manager (chunk operations) + +## Architecture Pattern +**Pure Logic Component**: F11 coordinates recovery strategies but delegates execution and communication. 
+- **NO Events**: Returns status objects or booleans +- **Caller Responsibility**: F02.2 decides state transitions based on F11 returns +- **Chunk Orchestration**: Coordinates F12 and F10 operations during recovery ## Detailed Description Per AC-6: "In case of being absolutely incapable of determining the system to determine next, second next, and third next images GPS, by any means (these 20% of the route), then it should ask the user for input for the next image." Tests that the coordinator can: -1. Detect when L1, L2, and L3 all fail for an image -2. Track consecutive failures (next, second next, third next) -3. Request user input after 3 consecutive failures -4. Apply user-provided GPS fixes -5. Resume processing after user intervention -6. Handle user input rejection/timeout -7. Provide failure diagnostics to user +1. Assess tracking confidence from VO and LiteSAM results +2. Detect tracking loss conditions +3. Coordinate progressive tile search (1→4→9→16→25) +4. Create user input request objects (NOT send them) +5. Apply user-provided GPS anchors +6. Proactively create chunks on tracking loss +7. Coordinate chunk semantic matching +8. Coordinate chunk LiteSAM matching with rotation sweeps +9. 
Merge chunks to main trajectory ## Input Data -### Test Case 1: Single Image Failure (L1 only) -- **Images**: AD000001-AD000005 -- **Simulate**: L1 fails for AD000003, L2 succeeds -- **Expected**: L2 recovers, no user input needed +### Test Case 1: Check Confidence (Good) +- **Input**: VO result with 80 inliers, LiteSAM confidence 0.85 +- **Expected**: + - Returns ConfidenceAssessment + - tracking_status = "good" + - overall_confidence > 0.7 -### Test Case 2: Triple Layer Failure (One Image) -- **Images**: AD000001-AD000005 -- **Simulate**: L1, L2, L3 all fail for AD000003 -- **Expected**: Mark as failed, continue to AD000004 +### Test Case 2: Check Confidence (Degraded) +- **Input**: VO result with 35 inliers, LiteSAM confidence 0.5 +- **Expected**: + - Returns ConfidenceAssessment + - tracking_status = "degraded" + - overall_confidence 0.3-0.7 -### Test Case 3: Three Consecutive Failures (AC-6) -- **Images**: AD000001-AD000010 -- **Simulate**: All layers fail for AD000003, AD000004, AD000005 -- **Expected**: After 3rd failure, request user input via SSE +### Test Case 3: Check Confidence (Lost) +- **Input**: VO result with 10 inliers, no LiteSAM result +- **Expected**: + - Returns ConfidenceAssessment + - tracking_status = "lost" + - overall_confidence < 0.3 -### Test Case 4: User Provides Fix -- **Context**: Test Case 3, user input requested -- **User Input**: GPS for AD000005 = 48.273997, 37.379828 -- **Expected**: Fix applied, processing resumes with AD000006 +### Test Case 4: Detect Tracking Loss +- **Input**: ConfidenceAssessment with tracking_status = "lost" +- **Expected**: Returns True (tracking lost) -### Test Case 5: User Input Timeout -- **Context**: User input requested, no response for 5 minutes -- **Expected**: Continue without fix, mark images as user_input_pending +### Test Case 5: Start Search +- **Input**: flight_id, frame_id, estimated_gps +- **Expected**: + - Returns SearchSession + - session.current_grid_size = 1 + - session.found = False + - 
session.exhausted = False -### Test Case 6: Intermittent Failures -- **Images**: AD000001-AD000020 -- **Simulate**: Failures for AD000003, AD000007, AD000012 (not consecutive) -- **Expected**: No user input requested, each recovered or skipped +### Test Case 6: Expand Search Radius +- **Input**: SearchSession with grid_size = 1 +- **Action**: Call expand_search_radius() +- **Expected**: + - Returns List[TileCoords] (3 new tiles for 2x2 grid) + - session.current_grid_size = 4 -### Test Case 7: Failure After User Fix -- **Context**: User fix applied, next image also fails -- **Expected**: Request another user fix +### Test Case 7: Try Current Grid (Match Found) +- **Input**: SearchSession, tiles dict with matching tile +- **Expected**: + - Returns AlignmentResult + - result.matched = True + - result.gps populated -### Test Case 8: Batch User Input -- **Context**: Multiple failures needing user input -- **Expected**: Request fixes for all failed images, process batch +### Test Case 8: Try Current Grid (No Match) +- **Input**: SearchSession, tiles dict with no matching tile +- **Expected**: + - Returns None + - Caller should call expand_search_radius() + +### Test Case 9: Mark Found +- **Input**: SearchSession, AlignmentResult +- **Expected**: + - Returns True + - session.found = True + +### Test Case 10: Get Search Status +- **Input**: SearchSession +- **Expected**: + - Returns SearchStatus + - Contains current_grid_size, found, exhausted + +### Test Case 11: Create User Input Request +- **Input**: flight_id, frame_id, candidate_tiles +- **Expected**: + - Returns UserInputRequest object (NOT sent) + - Contains request_id, flight_id, frame_id + - Contains uav_image, candidate_tiles, message + - **NOTE**: Caller (F02.2) sends to F15 + +### Test Case 12: Apply User Anchor +- **Input**: flight_id, frame_id, UserAnchor with GPS +- **Expected**: + - Calls F10.add_absolute_factor() with high confidence + - Returns True if successful + - **NOTE**: Caller (F02.2) updates state 
and publishes result + +### Test Case 13: Create Chunk on Tracking Loss +- **Input**: flight_id, frame_id +- **Expected**: + - Calls F12.create_chunk() + - Returns ChunkHandle + - chunk.is_active = True + - chunk.has_anchor = False + - chunk.matching_status = "unanchored" + +### Test Case 14: Try Chunk Semantic Matching +- **Input**: chunk_id (chunk with 10 frames) +- **Expected**: + - Gets chunk images via F12 + - Calls F08.retrieve_candidate_tiles_for_chunk() + - Returns List[TileCandidate] or None + +### Test Case 15: Try Chunk LiteSAM Matching +- **Input**: chunk_id, candidate_tiles +- **Expected**: + - Gets chunk images via F12 + - Calls F06.try_chunk_rotation_steps() (12 rotations) + - Returns ChunkAlignmentResult or None + - Result contains rotation_angle, chunk_center_gps, transform + +### Test Case 16: Merge Chunk to Trajectory +- **Input**: flight_id, chunk_id, ChunkAlignmentResult +- **Expected**: + - Calls F12.mark_chunk_anchored() + - Calls F12.merge_chunks() + - Returns True if successful + - **NOTE**: Caller (F02.2) coordinates result updates + +### Test Case 17: Process Unanchored Chunks (Logic) +- **Input**: flight_id with 2 unanchored chunks +- **Expected**: + - Calls F12.get_chunks_for_matching() + - For each ready chunk: + - try_chunk_semantic_matching() + - try_chunk_litesam_matching() + - merge_chunk_to_trajectory() if match found + +### Test Case 18: Progressive Search Full Flow +- **Scenario**: + - start_search() → grid_size=1 + - try_current_grid() → None + - expand_search_radius() → grid_size=4 + - try_current_grid() → None + - expand_search_radius() → grid_size=9 + - try_current_grid() → AlignmentResult + - mark_found() → success +- **Expected**: Search succeeds at 3x3 grid + +### Test Case 19: Search Exhaustion Flow +- **Scenario**: + - start_search() + - try all grids: 1→4→9→16→25, all fail + - create_user_input_request() +- **Expected**: + - Returns UserInputRequest + - session.exhausted = True + - **NOTE**: Caller sends request, waits 
for user fix + +### Test Case 20: Chunk Recovery Full Flow +- **Scenario**: + - create_chunk_on_tracking_loss() → chunk created + - Processing continues in chunk + - try_chunk_semantic_matching() → candidates found + - try_chunk_litesam_matching() → match at 90° rotation + - merge_chunk_to_trajectory() → success +- **Expected**: Chunk anchored and merged without user input ## Expected Output + +### ConfidenceAssessment ```json { - "failure_detected": true, - "failed_image_id": "AD000003", - "failure_layers": ["L1", "L2", "L3"], - "consecutive_failures": 3, - "action": "request_user_input|skip|continue", - "user_input_requested": true, - "user_fix_received": true/false, - "recovery_strategy": "string", - "timestamp": "ISO8601" + "overall_confidence": 0.85, + "vo_confidence": 0.9, + "litesam_confidence": 0.8, + "inlier_count": 80, + "tracking_status": "good|degraded|lost" +} +``` + +### SearchSession +```json +{ + "session_id": "string", + "flight_id": "string", + "frame_id": 42, + "center_gps": {"latitude": 48.275, "longitude": 37.385}, + "current_grid_size": 4, + "max_grid_size": 25, + "found": false, + "exhausted": false +} +``` + +### UserInputRequest +```json +{ + "request_id": "string", + "flight_id": "string", + "frame_id": 42, + "candidate_tiles": [...], + "message": "Please provide GPS location for this frame" +} +``` + +### ChunkAlignmentResult +```json +{ + "matched": true, + "chunk_id": "string", + "chunk_center_gps": {"latitude": 48.275, "longitude": 37.385}, + "rotation_angle": 90.0, + "confidence": 0.85, + "inlier_count": 150, + "transform": {...} } ``` ## Success Criteria -- **Test Cases 1-2**: Failures handled, no user input -- **Test Case 3**: User input requested after 3rd consecutive failure (AC-6) -- **Test Case 4**: User fix applied, processing continues -- **Test Case 5**: Timeout handled gracefully -- **Test Cases 6-8**: Appropriate recovery strategies applied + +**Test Cases 1-4 (Confidence)**: +- Confidence assessment accurate +- Thresholds 
correctly applied +- Tracking loss detected correctly + +**Test Cases 5-10 (Progressive Search)**: +- Search session management works +- Grid expansion correct (1→4→9→16→25) +- Match detection works + +**Test Cases 11-12 (User Input)**: +- UserInputRequest object created correctly (not sent) +- User anchor applied correctly + +**Test Cases 13-17 (Chunk Recovery)**: +- Proactive chunk creation works +- Chunk semantic matching works +- Chunk LiteSAM matching with rotation works +- Chunk merging works + +**Test Cases 18-20 (Full Flows)**: +- Progressive search flow completes +- Search exhaustion flow completes +- Chunk recovery flow completes ## Maximum Expected Time -- Failure detection: < 100ms -- User input request: < 500ms -- Apply user fix: < 1 second +- check_confidence: < 10ms +- detect_tracking_loss: < 5ms +- Progressive search (25 tiles): < 1.5s total +- create_user_input_request: < 100ms +- apply_user_anchor: < 500ms +- Chunk semantic matching: < 2s +- Chunk LiteSAM matching (12 rotations): < 10s - Total test suite: < 120 seconds ## Pass/Fail Criteria -**Passes If**: AC-6 requirement met (user input after 3 consecutive failures), all recovery strategies work -**Fails If**: User input not requested when needed, or processing deadlocks +**Overall Test Passes If**: +- All 20 test cases pass +- Confidence assessment accurate +- Progressive search works +- User input request created correctly (not sent) +- Chunk recovery works +- No direct event emission (pure logic) + +**Test Fails If**: +- Tracking loss not detected when it should be +- Progressive search fails to expand correctly +- User input request not created when needed +- F11 directly emits events (violates architecture) +- Chunk recovery fails +- Performance exceeds targets + +## Architecture Validation + +**F11 Must NOT**: +- Call F15 directly (SSE events) +- Emit events to clients +- Manage processing state +- Control processing flow + +**F11 Must**: +- Return status objects for all operations +- Let 
caller (F02.2) decide next actions +- Coordinate with F10, F12 for chunk operations +- Be testable in isolation (no I/O dependencies) diff --git a/docs/03_tests/39_route_chunk_connection_spec.md b/docs/03_tests/39_route_chunk_connection_spec.md index 96c993a..bee3074 100644 --- a/docs/03_tests/39_route_chunk_connection_spec.md +++ b/docs/03_tests/39_route_chunk_connection_spec.md @@ -8,12 +8,14 @@ Validate Acceptance Criterion 5 (partial): "System should try to operate when UA ## Preconditions 1. System with "Atlas" multi-map capability (factor graph with native chunk support) -2. F12 Route Chunk Manager functional -3. F10 Factor Graph Optimizer with multi-chunk support -4. L2 global place recognition functional (chunk semantic matching) -5. L3 metric refinement functional (chunk LiteSAM matching) -6. Geodetic map-merging logic implemented (Sim(3) transform) -7. Test dataset: Simulate 3 disconnected route fragments +2. F02.2 Flight Processing Engine running +3. F11 Failure Recovery Coordinator (chunk orchestration) +4. F12 Route Chunk Manager functional (chunk lifecycle) +5. F10 Factor Graph Optimizer with multi-chunk support (subgraph operations) +6. F08 Global Place Recognition (chunk semantic matching via `retrieve_candidate_tiles_for_chunk()`) +7. F09 Metric Refinement (chunk LiteSAM matching) +8. Geodetic map-merging logic implemented (Sim(3) transform via F10.merge_chunk_subgraphs()) +9. Test dataset: Simulate 3 disconnected route fragments ## Test Description Test system's ability to handle completely disconnected route segments (no overlap between segments) and eventually connect them into a coherent trajectory using global GPS anchors. 
@@ -141,24 +143,24 @@ Processing Mode: Multi-Map Atlas ## Architecture Elements **Multi-Map "Atlas"** (per solution document): -- Each disconnected segment gets own local map -- Local maps independently optimized -- GPS anchors provide global reference -- Geodetic merging aligns all maps +- Each disconnected segment gets own local map via F12.create_chunk() +- Local maps independently optimized via F10.optimize_chunk() +- GPS anchors provide global reference via F10.add_chunk_anchor() +- Geodetic merging aligns all maps via F10.merge_chunk_subgraphs() **Recovery Mechanisms**: -- **Proactive chunk creation** on tracking loss (immediate, not reactive) -- Chunk semantic matching (aggregate DINOv2) finds location for chunk -- Chunk LiteSAM matching (with rotation sweeps) refines GPS anchor -- Factor graph creates new chunk subgraph -- Sim(3) transform merges chunks into global trajectory +- **Proactive chunk creation** via F11.create_chunk_on_tracking_loss() (immediate, not reactive) +- Chunk semantic matching via F08.retrieve_candidate_tiles_for_chunk() (aggregate DINOv2) +- Chunk LiteSAM matching via F06.try_chunk_rotation_steps() + F09.align_chunk_to_satellite() +- F10 creates new chunk subgraph +- Sim(3) transform merges chunks via F12.merge_chunks() → F10.merge_chunk_subgraphs() **Fragment Detection**: - Large displacement (> 500m) from last image -- Low/zero overlap +- Low/zero overlap (F07 VO fails) - L1 failure triggers **proactive** new chunk creation - Chunks processed independently with local optimization -- Multiple chunks can exist simultaneously +- Multiple chunks can exist simultaneously (F10 supports multi-chunk factor graph) ## Notes - AC-5 describes realistic operational scenario (multiple turns, disconnected segments) diff --git a/docs/03_tests/40_user_input_recovery_spec.md b/docs/03_tests/40_user_input_recovery_spec.md index 7ea1520..259ee61 100644 --- a/docs/03_tests/40_user_input_recovery_spec.md +++ b/docs/03_tests/40_user_input_recovery_spec.md 
@@ -1,42 +1,254 @@ # Acceptance Test: AC-6 - User Input Recovery ## Summary -Validate Acceptance Criterion 6: "In case of being absolutely incapable of determining the system to determine next, second next, and third next images GPS, by any means (these 20% of the route), then it should ask the user for input for the next image." +Validate Acceptance Criterion 6: "In case of being absolutely incapable of determining the system to determine next, second next, and third next images GPS, by any means (these 20% of the route), then it should ask the user for input for the next image, so that the user can specify the location." ## Linked Acceptance Criteria **AC-6**: User input requested after 3 consecutive failures +## Preconditions +1. ASTRAL-Next system operational +2. F11 Failure Recovery Coordinator configured with failure threshold = 3 +3. F15 SSE Event Streamer functional +4. F01 Flight API accepting user-fix endpoint +5. F10 Factor Graph Optimizer ready to accept high-confidence anchors +6. Test environment configured to simulate L1/L2/L3 failures +7. 
SSE client connected and monitoring events + +## Test Data +- **Dataset**: AD000001-060 (60 images) +- **Failure Injection**: Configure mock failures for specific frames +- **Ground Truth**: coordinates.csv for validation + ## Test Steps -### Step 1: Simulate Triple Failure -- **Action**: Process flight where L1, L2, L3 all fail for AD000003, AD000004, AD000005 -- **Expected Result**: After 3rd consecutive failure, system requests user input via SSE event +### Step 1: Setup Failure Injection +- **Action**: Configure system to fail L1, L2, L3 for frames AD000020, AD000021, AD000022 +- **Expected Result**: + - L1 (SuperPoint+LightGlue): Returns match_count < 10 + - L2 (AnyLoc): Returns confidence < 0.3 + - L3 (LiteSAM): Returns alignment_score < 0.2 -### Step 2: User Receives Notification -- **Action**: SSE client receives "user_input_required" event -- **Expected Result**: Event includes image needing fix (AD000005), top-3 satellite tiles for reference +### Step 2: Process Normal Frames (1-19) +- **Action**: Process AD000001-AD000019 normally +- **Expected Result**: + - All 19 frames processed successfully + - No user input requests + - SSE events: 19 × `frame_processed` -### Step 3: User Provides GPS Fix -- **Action**: User submits GPS for AD000005: POST /flights/{flightId}/user-fix -- **Payload**: `{"frame_id": 5, "uav_pixel": [3126, 2084], "satellite_gps": {"lat": 48.273997, "lon": 37.379828}}` -- **Expected Result**: Fix accepted, processing resumes, SSE event "user_fix_applied" sent +### Step 3: First Consecutive Failure +- **Action**: Process AD000020 +- **Expected Result**: + - L1 fails (low match count) + - L2 fallback fails (low confidence) + - L3 fallback fails (low alignment) + - System increments failure_count to 1 + - SSE event: `frame_processing_failed` with frame_id=20 + - **No user input request yet** -### Step 4: System Incorporates Fix -- **Action**: Factor graph adds user fix as high-confidence GPS anchor -- **Expected Result**: Trajectory refined 
incorporating user input +### Step 4: Second Consecutive Failure +- **Action**: Process AD000021 +- **Expected Result**: + - All layers fail + - failure_count incremented to 2 + - SSE event: `frame_processing_failed` with frame_id=21 + - **No user input request yet** -### Step 5: Processing Continues -- **Action**: System processes AD000006 and beyond -- **Expected Result**: Processing continues normally +### Step 5: Third Consecutive Failure - Triggers User Input +- **Action**: Process AD000022 +- **Expected Result**: + - All layers fail + - failure_count reaches threshold (3) + - F11 calls `create_user_input_request()` + - SSE event: `user_input_required` + - Event payload contains: + ```json + { + "type": "user_input_required", + "flight_id": "", + "frame_id": 22, + "failed_frames": [20, 21, 22], + "candidate_tiles": [ + {"tile_id": "xyz", "gps": {"lat": 48.27, "lon": 37.38}, "thumbnail_url": "..."}, + {"tile_id": "abc", "gps": {"lat": 48.26, "lon": 37.37}, "thumbnail_url": "..."}, + {"tile_id": "def", "gps": {"lat": 48.28, "lon": 37.39}, "thumbnail_url": "..."} + ], + "uav_image_url": "/flights//images/22", + "message": "System unable to locate 3 consecutive images. Please provide GPS fix." 
+ } + ``` + +### Step 6: Validate Threshold Behavior +- **Action**: Verify user input NOT requested before 3 failures +- **Expected Result**: + - Review event log: no `user_input_required` before frame 22 + - Threshold is exactly 3 consecutive failures, not 2 or 4 + +### Step 7: User Provides GPS Fix +- **Action**: POST /flights/{flightId}/user-fix +- **Payload**: + ```json + { + "frame_id": 22, + "uav_pixel": [3126, 2084], + "satellite_gps": {"lat": 48.273997, "lon": 37.379828}, + "confidence": "high" + } + ``` +- **Expected Result**: + - HTTP 200 OK + - Response: `{"status": "accepted", "frame_id": 22}` + +### Step 8: System Incorporates User Fix +- **Action**: F11 processes user fix via `apply_user_anchor()` +- **Expected Result**: + - F10 adds GPS anchor with high confidence (weight = 10.0) + - Factor graph re-optimizes + - SSE event: `user_fix_applied` + - Event payload: + ```json + { + "type": "user_fix_applied", + "frame_id": 22, + "estimated_gps": {"lat": 48.273997, "lon": 37.379828}, + "affected_frames": [20, 21, 22] + } + ``` + +### Step 9: Trajectory Refinement +- **Action**: Factor graph back-propagates fix to frames 20, 21 +- **Expected Result**: + - SSE event: `trajectory_refined` for frames 20, 21 + - All 3 failed frames now have GPS estimates + - failure_count reset to 0 + +### Step 10: Processing Resumes Automatically +- **Action**: System processes AD000023 and beyond +- **Expected Result**: + - Processing resumes without manual restart + - AD000023+ processed normally (no more injected failures) + - SSE events continue: `frame_processed` + +### Step 11: Validate 20% Route Allowance +- **Action**: Calculate maximum allowed user inputs for 60-image flight +- **Expected Result**: + - 20% of 60 = 12 images maximum can need user input + - System tracks user_input_count per flight + - If user_input_count > 12, system logs warning but continues + +### Step 12: Test Multiple User Input Cycles +- **Action**: Inject failures for frames AD000040, AD000041, 
AD000042
+- **Expected Result**:
+  - Second `user_input_required` event triggered
+  - User provides second fix
+  - System continues processing
+  - Total user inputs: 2 cycles (6 frames aided)
+
+### Step 13: Test User Input Timeout
+- **Action**: Trigger user input request, wait 5 minutes without response
+- **Expected Result**:
+  - System sends reminder: `user_input_reminder` at 2 minutes
+  - Processing remains paused for affected chunk
+  - Other chunks (if any) continue processing
+  - No timeout crash
+
+### Step 14: Test Invalid User Fix
+- **Action**: Submit user fix with invalid GPS (outside geofence)
+- **Payload**:
+  ```json
+  {
+    "frame_id": 22,
+    "satellite_gps": {"lat": 0.0, "lon": 0.0}
+  }
+  ```
+- **Expected Result**:
+  - HTTP 400 Bad Request
+  - Error: "GPS coordinates outside flight geofence"
+  - System re-requests user input
+
+### Step 15: Validate Final Flight Statistics
+- **Action**: GET /flights/{flightId}/status
+- **Expected Result**:
+  ```json
+  {
+    "flight_id": "",
+    "total_frames": 60,
+    "processed_frames": 60,
+    "user_input_requests": 2,
+    "user_inputs_provided": 2,
+    "frames_aided_by_user": 6,
+    "user_input_percentage": 10.0
+  }
+  ```
 
 ## Success Criteria
-- User input requested after 3 consecutive failures (not before)
-- User notified via SSE with relevant info
-- User fix incorporated with high confidence
-- Processing resumes automatically
-- Allows up to 20% of route to need user input (12 out of 60 images)
+
+**Primary Criteria (AC-6)**:
+- User input requested after exactly 3 consecutive failures (not 2, not 4)
+- User notified via SSE with relevant context (candidate tiles, image URL)
+- User fix accepted via REST API
+- User fix incorporated as high-confidence GPS anchor
+- Processing resumes automatically after fix
+- System allows up to 20% of route to need user input
+
+**Supporting Criteria**:
+- SSE events delivered within 1 second
+- Factor graph incorporates fix within 2 seconds
+- Back-propagation refines earlier failed frames
+- failure_count resets after successful fix
+- System handles multiple user input cycles per flight
 
 ## Pass/Fail Criteria
-**Passes If**: User input mechanism works, threshold correct (3 failures), processing resumes
-**Fails If**: User input not requested, or system cannot incorporate user fixes
+**TEST PASSES IF**:
+- User input request triggered at exactly 3 consecutive failures
+- SSE event contains all required info (frame_id, candidate tiles)
+- User fix accepted and incorporated
+- Processing resumes automatically
+- 20% allowance calculated correctly
+- Multiple cycles work correctly
+- Invalid fixes rejected gracefully
+
+**TEST FAILS IF**:
+- User input requested before 3 failures
+- User input NOT requested after 3 failures
+- SSE event missing required fields
+- User fix causes system error
+- Processing does not resume after fix
+- System crashes on invalid user input
+- Timeout causes system hang
+
+## Error Scenarios
+
+### Scenario A: User Provides Wrong GPS
+- User fix GPS is 500m from actual location
+- System accepts fix (user has authority)
+- Subsequent frames may fail again
+- Second user input cycle may be needed
+
+### Scenario B: SSE Connection Lost
+- Client disconnects during user input wait
+- System buffers events
+- Client reconnects, receives pending events
+- Processing state preserved
+
+### Scenario C: Database Failure During Fix
+- User fix received but DB write fails
+- System retries 3 times
+- If all retries fail, returns HTTP 503
+- User can retry submission
+
+## Components Involved
+- F01 Flight API: `POST /flights/{id}/user-fix`
+- F02.1 Flight Lifecycle Manager: `handle_user_fix()`
+- F02.2 Flight Processing Engine: `apply_user_fix()`
+- F10 Factor Graph Optimizer: `add_absolute_factor()` with high confidence
+- F11 Failure Recovery Coordinator: `create_user_input_request()`, `apply_user_anchor()`
+- F15 SSE Event Streamer: `send_user_input_request()`, `send_user_fix_applied()`
+
+## Notes
+- AC-6 is the human-in-the-loop fallback for extreme failures
+- 3-failure threshold balances automation with user intervention
+- 20% allowance (12 of 60 images) is operational constraint
+- User fixes are trusted (high confidence weight in factor graph)
+- System should minimize user inputs via L1/L2/L3 layer defense
diff --git a/docs/03_tests/47_reprojection_error_spec.md b/docs/03_tests/47_reprojection_error_spec.md
index 4816f0d..6bb742c 100644
--- a/docs/03_tests/47_reprojection_error_spec.md
+++ b/docs/03_tests/47_reprojection_error_spec.md
@@ -6,35 +6,251 @@ Validate Acceptance Criterion 10: "Mean Reprojection Error (MRE) < 1.0 pixels. T
 
 ## Linked Acceptance Criteria
 **AC-10**: MRE < 1.0 pixels
+## Preconditions
+1. ASTRAL-Next system operational
+2. F07 Sequential Visual Odometry extracting and matching features
+3. F10 Factor Graph Optimizer computing optimized poses
+4. Camera intrinsics calibrated (from F17 Configuration Manager)
+5. Test dataset with ground truth poses (for reference)
+6. Reprojection error calculation implemented
+
+## Reprojection Error Definition
+
+**Formula**:
+```
+For each matched feature point p_i in image I_j:
+  1. Triangulate 3D point X_i from matches across images
+  2. Project X_i back to image I_j using optimized pose T_j and camera K
+  3. p'_i = K * T_j * X_i (projected pixel location)
+  4. e_i = ||p_i - p'_i|| (Euclidean distance in pixels)
+
+MRE = (1/N) * Σ e_i (mean across all features in all images)
+```
+
+## Test Data
+- **Dataset**: AD000001-AD000030 (30 images, baseline)
+- **Expected Features**: ~500-2000 matched features per image pair
+- **Total Measurements**: ~15,000-60,000 reprojection measurements
+
 ## Test Steps
 
-### Step 1: Process Flight with Factor Graph
-- **Action**: Process AD000001-030 through complete pipeline
-- **Expected Result**: Factor graph optimizes full trajectory
+### Step 1: Process Flight Through Complete Pipeline
+- **Action**: Process AD000001-AD000030 through full ASTRAL-Next pipeline
+- **Expected Result**:
+  - Factor graph initialized and optimized
+  - 30 poses computed
+  - All feature correspondences stored
 
-### Step 2: Calculate Reprojection Errors
-- **Action**: For each matched feature across image pairs:
-  - Project 3D point back to image plane using optimized poses
-  - Measure pixel distance from original detection
-- **Expected Result**: Array of reprojection errors for all features
+### Step 2: Extract Feature Correspondences
+- **Action**: Retrieve all matched features from F07
+- **Expected Result**:
+  - For each image pair (i, j):
+    - List of matched keypoint pairs: [(p_i, p_j), ...]
+    - Match confidence scores
+    - Total: ~500-1500 matches per pair
+  - Total matches across flight: ~15,000-45,000
 
-### Step 3: Compute Mean Reprojection Error
-- **Action**: Calculate mean across all features in all images
-- **Expected Result**: MRE < 1.0 pixels
+### Step 3: Triangulate 3D Points
+- **Action**: For each matched feature across multiple views, triangulate 3D position
+- **Expected Result**:
+  - 3D point cloud generated
+  - Each point has:
+    - 3D coordinates (X, Y, Z) in ENU frame
+    - List of observations (image_id, pixel_location)
+    - Triangulation uncertainty
 
-### Step 4: Validate Factor Graph Quality
-- **Action**: Low MRE indicates:
-  - Poses geometrically consistent
-  - 3D structure accurate
-  - No "tension" in factor graph
-- **Expected Result**: MRE correlates with GPS accuracy
+### Step 4: Calculate Per-Feature Reprojection Error
+- **Action**: For each 3D point and each observation:
+  ```
+  For point X with observation (image_j, pixel_p):
+    1. Get optimized pose T_j from factor graph
+    2. Get camera intrinsics K from config
+    3. Project: p' = project(K, T_j, X)
+    4. Error: e = sqrt((p.x - p'.x)² + (p.y - p'.y)²)
+  ```
+- **Expected Result**:
+  - Array of per-feature reprojection errors
+  - Typical range: 0.1 - 3.0 pixels
+
+### Step 5: Compute Statistical Metrics
+- **Action**: Calculate MRE and distribution statistics
+- **Expected Result**:
+  ```
+  Total features evaluated: 25,000
+  Mean Reprojection Error (MRE): 0.72 pixels
+  Median Reprojection Error: 0.58 pixels
+  Standard Deviation: 0.45 pixels
+  90th Percentile: 1.25 pixels
+  95th Percentile: 1.68 pixels
+  99th Percentile: 2.41 pixels
+  Max Error: 4.82 pixels
+  ```
+
+### Step 6: Validate MRE Threshold
+- **Action**: Compare MRE against AC-10 requirement
+- **Expected Result**:
+  - **MRE = 0.72 pixels < 1.0 pixels** ✓
+  - AC-10 PASS
+
+### Step 7: Identify Outlier Reprojections
+- **Action**: Find features with reprojection error > 3.0 pixels
+- **Expected Result**:
+  ```
+  Outliers (> 3.0 pixels): 127 (0.5% of total)
+  Outlier distribution:
+    - 3.0-5.0 pixels: 98 features
+    - 5.0-10.0 pixels: 27 features
+    - > 10.0 pixels: 2 features
+  ```
+
+### Step 8: Analyze Outlier Causes
+- **Action**: Investigate high-error features
+- **Expected Result**:
+  - Most outliers at image boundaries (lens distortion)
+  - Some at occlusion boundaries
+  - Moving objects (if any)
+  - Repetitive textures causing mismatches
+
+### Step 9: Per-Image MRE Analysis
+- **Action**: Calculate MRE per image
+- **Expected Result**:
+  ```
+  Per-Image MRE:
+    AD000001: 0.68 px (baseline)
+    AD000002: 0.71 px
+    ...
+    AD000032: 1.12 px (sharp turn - higher error)
+    AD000033: 0.95 px
+    ...
+    AD000030: 0.74 px
+
+  Images with MRE > 1.0: 2 out of 30 (6.7%)
+  Overall MRE: 0.72 px
+  ```
+
+### Step 10: Temporal MRE Trend
+- **Action**: Plot MRE over sequence to detect drift
+- **Expected Result**:
+  - MRE relatively stable across sequence
+  - No significant upward trend (would indicate drift)
+  - Spikes at known challenging locations (sharp turns)
+
+### Step 11: Validate Robust Kernel Effect
+- **Action**: Compare MRE with/without robust cost functions
+- **Expected Result**:
+  ```
+  Without robust kernels: MRE = 0.89 px, outliers affect mean
+  With Cauchy kernel: MRE = 0.72 px, outliers downweighted
+  Improvement: 19% reduction in MRE
+  ```
+
+### Step 12: Cross-Validate with GPS Accuracy
+- **Action**: Correlate MRE with GPS error
+- **Expected Result**:
+  - Low MRE correlates with low GPS error
+  - Images with MRE > 1.5 px tend to have GPS error > 30m
+  - MRE is leading indicator of trajectory quality
+
+### Step 13: Test Under Challenging Conditions
+- **Action**: Compute MRE for challenging dataset (AD000001-060)
+- **Expected Result**:
+  ```
+  Full Flight MRE:
+    Total features: 55,000
+    MRE: 0.84 pixels (still < 1.0)
+    Challenging segments:
+      - Sharp turns: MRE = 1.15 px (above threshold locally)
+      - Normal segments: MRE = 0.68 px
+    Overall: AC-10 PASS
+  ```
+
+### Step 14: Generate Reprojection Error Report
+- **Action**: Create comprehensive MRE report
+- **Expected Result**:
+  ```
+  ========================================
+  REPROJECTION ERROR REPORT
+  Flight: AC10_Test
+  Dataset: AD000001-AD000030
+  ========================================
+
+  SUMMARY:
+    Mean Reprojection Error: 0.72 pixels
+    AC-10 Threshold: 1.0 pixels
+    Status: PASS ✓
+
+  DISTRIBUTION:
+    < 0.5 px: 12,450 (49.8%)
+    0.5-1.0 px: 9,875 (39.5%)
+    1.0-2.0 px: 2,350 (9.4%)
+    2.0-3.0 px: 198 (0.8%)
+    > 3.0 px: 127 (0.5%)
+
+  PER-IMAGE BREAKDOWN:
+    Images meeting < 1.0 px MRE: 28/30 (93.3%)
+    Images with highest MRE: AD000032 (1.12 px), AD000048 (1.08 px)
+
+  CORRELATION WITH GPS ACCURACY:
+    Pearson correlation (MRE vs GPS error): 0.73
+    Low MRE predicts high GPS accuracy
+
+  RECOMMENDATIONS:
+    - System meets AC-10 requirement
+    - Consider additional outlier filtering for images > 1.0 px MRE
+    - Sharp turn handling could be improved
+  ========================================
+  ```
 
 ## Success Criteria
-- Mean Reprojection Error < 1.0 pixels
-- Standard deviation reasonable (< 2.0 pixels)
-- No outlier reprojections (> 10 pixels)
+
+**Primary Criterion (AC-10)**:
+- Mean Reprojection Error < 1.0 pixels across entire flight
+
+**Supporting Criteria**:
+- Standard deviation < 2.0 pixels
+- No outlier reprojections > 10 pixels (indicates gross errors)
+- Per-image MRE < 2.0 pixels (no catastrophic single-image failures)
+- MRE stable across sequence (no drift)
 
 ## Pass/Fail Criteria
-**Passes If**: MRE < 1.0 pixels
-**Fails If**: MRE ≥ 1.0 pixels, indicating geometry inconsistencies
+**TEST PASSES IF**:
+- Overall MRE < 1.0 pixels
+- Standard deviation reasonable (< 2.0 pixels)
+- Less than 1% of features have error > 5.0 pixels
+- MRE consistent across multiple test runs (variance < 10%)
+
+**TEST FAILS IF**:
+- MRE ≥ 1.0 pixels
+- Standard deviation > 3.0 pixels (high variance indicates instability)
+- More than 5% of features have error > 5.0 pixels
+- MRE increases significantly over sequence (drift)
+
+## Diagnostic Actions if Failing
+
+**If MRE > 1.0 px**:
+1. Check camera calibration accuracy
+2. Verify lens distortion model
+3. Review feature matching quality (outlier ratio)
+4. Examine factor graph convergence
+5. Check for scale drift in trajectory
+
+**If High Variance**:
+1. Investigate images with outlier MRE
+2. Check for challenging conditions (blur, low texture)
+3. Review robust kernel settings
+4. Verify triangulation accuracy
+
+## Components Involved
+- F07 Sequential Visual Odometry: Feature extraction and matching
+- F10 Factor Graph Optimizer: Pose optimization, marginal covariances
+- F13 Coordinate Transformer: 3D point projection
+- H01 Camera Model: Camera intrinsics, projection functions
+- H03 Robust Kernels: Outlier handling in optimization
+
+## Notes
+- MRE is a geometric consistency metric, not direct GPS accuracy
+- Low MRE indicates well-constrained factor graph
+- High MRE with good GPS accuracy = overfitting to GPS anchors
+- Low MRE with poor GPS accuracy = scale/alignment issues
+- AC-10 validates internal consistency of vision pipeline
diff --git a/docs/03_tests/55_chunk_rotation_recovery_spec.md b/docs/03_tests/55_chunk_rotation_recovery_spec.md
index 6bb4000..2c81f73 100644
--- a/docs/03_tests/55_chunk_rotation_recovery_spec.md
+++ b/docs/03_tests/55_chunk_rotation_recovery_spec.md
@@ -8,11 +8,14 @@ Validate chunk LiteSAM matching with rotation sweeps for chunks with unknown ori
 **AC-5**: Connect route chunks
 
 ## Preconditions
-1. F12 Route Chunk Manager functional
-2. F06 Image Rotation Manager with chunk rotation support
-3. F09 Metric Refinement with chunk LiteSAM matching
-4. F10 Factor Graph Optimizer with chunk merging
-5. Test dataset: Chunk with unknown orientation (simulated sharp turn)
+1. F02.2 Flight Processing Engine running
+2. F11 Failure Recovery Coordinator (chunk orchestration, returns status objects)
+3. F12 Route Chunk Manager functional (chunk lifecycle via `create_chunk()`, `mark_chunk_anchored()`)
+4. F06 Image Rotation Manager with chunk rotation support (`try_chunk_rotation_steps()`)
+5. F08 Global Place Recognition (chunk semantic matching via `retrieve_candidate_tiles_for_chunk()`)
+6. F09 Metric Refinement with chunk LiteSAM matching (`align_chunk_to_satellite()`)
+7. F10 Factor Graph Optimizer with chunk operations (`add_chunk_anchor()`, `merge_chunk_subgraphs()`)
+8. Test dataset: Chunk with unknown orientation (simulated sharp turn)
 
 ## Test Description
 Test system's ability to match chunks with unknown orientation using rotation sweeps. When a chunk is created after a sharp turn, its orientation relative to the satellite map is unknown. The system must rotate the entire chunk to all possible angles and attempt LiteSAM matching.
@@ -54,8 +57,8 @@ Test system's ability to match chunks with unknown orientation using rotation sw
 ### Step 4: Chunk Merging
 - **Action**: Merge chunk_2 to main trajectory
 - **Expected Result**:
-  - F10.add_chunk_anchor() anchors chunk_2
-  - F10.merge_chunks() merges chunk_2 into chunk_1
+  - F12.mark_chunk_anchored() updates chunk state (calls F10.add_chunk_anchor())
+  - F12.merge_chunks() merges chunk_2 into chunk_1 (calls F10.merge_chunk_subgraphs())
   - Sim(3) transform applied correctly
   - Global trajectory consistent
 
diff --git a/docs/03_tests/56_multi_chunk_simultaneous_spec.md b/docs/03_tests/56_multi_chunk_simultaneous_spec.md
index 3b3103f..3a97434 100644
--- a/docs/03_tests/56_multi_chunk_simultaneous_spec.md
+++ b/docs/03_tests/56_multi_chunk_simultaneous_spec.md
@@ -8,10 +8,13 @@ Validate system's ability to process multiple chunks simultaneously, matching an
 **AC-5**: Connect route chunks (multiple chunks)
 
 ## Preconditions
-1. F10 Factor Graph Optimizer with native multi-chunk support
-2. F12 Route Chunk Manager functional
-3. F11 Failure Recovery Coordinator with chunk orchestration
-4. Test dataset: Flight with 3 disconnected segments
+1. F02.2 Flight Processing Engine running
+2. F10 Factor Graph Optimizer with native multi-chunk support (subgraph operations)
+3. F11 Failure Recovery Coordinator (pure logic, returns status objects to F02.2)
+4. F12 Route Chunk Manager functional (chunk lifecycle: `create_chunk()`, `add_frame_to_chunk()`, `mark_chunk_anchored()`, `merge_chunks()`)
+5. F08 Global Place Recognition (chunk semantic matching via `retrieve_candidate_tiles_for_chunk()`)
+6. F09 Metric Refinement (chunk LiteSAM matching)
+7. Test dataset: Flight with 3 disconnected segments
 
 ## Test Description
 Test system's ability to handle multiple disconnected route segments simultaneously. The system should create chunks proactively, process them independently, and match/merge them asynchronously without blocking frame processing.
@@ -143,24 +146,26 @@ Multi-Chunk Simultaneous Processing:
 ## Architecture Elements
 
 **Multi-Chunk Support**:
-- F10 Factor Graph Optimizer supports multiple chunks simultaneously
-- Each chunk has own subgraph
-- Chunks optimized independently
+- F10 Factor Graph Optimizer supports multiple chunks via `create_chunk_subgraph()`
+- Each chunk has own subgraph, optimized independently via `optimize_chunk()`
+- F12 Route Chunk Manager owns chunk metadata (status, is_active, etc.)
 
 **Proactive Chunk Creation**:
-- Chunks created immediately on tracking loss
-- Not reactive (doesn't wait for matching to fail)
-- Processing continues in new chunk
+- F11 triggers chunk creation via `create_chunk_on_tracking_loss()`
+- F12.create_chunk() creates chunk and calls F10.create_chunk_subgraph()
+- Processing continues in new chunk immediately (not reactive)
 
 **Asynchronous Matching**:
-- Background task processes unanchored chunks
+- F02.2 manages background task that calls F11.process_unanchored_chunks()
+- F11 calls F12.get_chunks_for_matching() to find ready chunks
+- F11.try_chunk_semantic_matching() → F11.try_chunk_litesam_matching()
 - Matching doesn't block frame processing
-- Chunks matched and merged asynchronously
 
 **Chunk Merging**:
-- Sim(3) transform for merging
-- Accounts for translation, rotation, scale
-- Global optimization after merging
+- F11.merge_chunk_to_trajectory() coordinates merging
+- F12.merge_chunks() updates chunk state and calls F10.merge_chunk_subgraphs()
+- Sim(3) transform accounts for translation, rotation, scale
+- F10.optimize_global() runs after merging
 
 ## Notes
 - Multiple chunks can exist simultaneously