Integration Test: Result Manager
Summary
Validate the Result Manager component responsible for storing, retrieving, and managing GPS localization results for processed images.
Component Under Test
Component: Result Manager
Location: gps_denied_13_result_manager
Dependencies:
- Database Layer (result persistence)
- Factor Graph Optimizer (source of results)
- Coordinate Transformer
- SSE Event Streamer (result notifications)
Detailed Description
This test validates that the Result Manager can:
- Store initial GPS estimates from vision pipeline
- Store refined GPS results after factor graph optimization
- Track result versioning (initial vs refined per AC-8)
- Retrieve results by flight, image, or time range
- Support various output formats (JSON, CSV, KML)
- Calculate and store accuracy metrics
- Manage result updates when trajectory is re-optimized
- Handle user-provided fixes and manual corrections
- Export results for external analysis
- Maintain result history and audit trail
Per AC-8 ("system could refine existing calculated results and send refined results again to user"), the Result Manager must track both initial and refined results.
Input Data
Test Case 1: Store Initial Result
- Flight: Test_Flight_001
- Image: AD000001.jpg
- GPS Estimate: 48.275290, 37.385218 (from L3)
- Ground Truth: 48.275292, 37.385220
- Metadata: confidence=0.92, processing_time_ms=450, layer="L3"
- Expected: Result stored with version=1 (initial)
Test Case 2: Store Refined Result
- Flight: Test_Flight_001
- Image: AD000001.jpg (same as Test Case 1)
- GPS Refined: 48.275291, 37.385219 (from Factor Graph)
- Metadata: confidence=0.95, refinement_reason="new_anchor"
- Expected: Result stored with version=2 (refined), version=1 preserved
Test Case 3: Batch Store Results
- Flight: Test_Flight_002
- Images: AD000001-AD000010
- Expected: All 10 results stored atomically
Test Case 4: Retrieve Single Result
- Query: Get result for AD000001.jpg in Test_Flight_001
- Options: include_all_versions=true
- Expected: Returns both version=1 (initial) and version=2 (refined)
Test Case 5: Retrieve Flight Results
- Query: Get all results for Test_Flight_001
- Options: latest_version_only=true
- Expected: Returns latest version for each image
Test Case 6: Retrieve with Filtering
- Query: Get results for Test_Flight_001
- Filter: confidence > 0.9, error_m < 50
- Expected: Returns only results matching criteria
Test Case 7: Export to JSON
- Flight: Test_Flight_001
- Format: JSON
- Expected: Valid JSON file with all results
Test Case 8: Export to CSV
- Flight: Test_Flight_001
- Format: CSV
- Columns: image, lat, lon, error_m, confidence
- Expected: Valid CSV matching coordinates.csv format
Test Case 9: Export to KML
- Flight: Test_Flight_001
- Format: KML (for Google Earth)
- Expected: Valid KML with placemarks for each image
Test Case 10: Store User Fix
- Flight: Test_Flight_001
- Image: AD000005.jpg
- User GPS: 48.273997, 37.379828 (from ground truth)
- Metadata: source="user", confidence=1.0
- Expected: User fix stored with special flag, triggers refinement
Test Case 11: Calculate Statistics
- Flight: Test_Flight_001 with ground truth
- Calculation: Compare estimated vs ground truth
- Expected Statistics:
  - mean_error_m
  - median_error_m
  - rmse_m
  - percent_under_50m
  - percent_under_20m
  - max_error_m
  - registration_rate
Test Case 12: Result History
- Query: Get history for AD000001.jpg
- Expected: Returns timeline of all versions with timestamps
Expected Output
For each test case:
{
  "result_id": "unique_result_identifier",
  "flight_id": "flight_123",
  "image_id": "AD000001",
  "sequence_number": 1,
  "version": 1,
  "estimated_gps": {
    "lat": 48.275290,
    "lon": 37.385218,
    "altitude_m": 400
  },
  "ground_truth_gps": {
    "lat": 48.275292,
    "lon": 37.385220
  },
  "error_m": 0.25,
  "confidence": 0.92,
  "source": "L3|factor_graph|user",
  "processing_time_ms": 450,
  "metadata": {
    "layer": "L3",
    "num_features": 250,
    "inlier_ratio": 0.85
  },
  "created_at": "timestamp",
  "refinement_reason": "string|null"
}
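The "all fields present and valid" check used in the success criteria could be sketched as below. This is a minimal illustration, not the component's actual validation: `REQUIRED_FIELDS` and `validate_result` are hypothetical names, the field list is copied from the example record above, and the expected types and value ranges are assumptions.

```python
# Field names come from the example record; the expected types are assumptions.
REQUIRED_FIELDS = {
    "result_id": str,
    "flight_id": str,
    "image_id": str,
    "sequence_number": int,
    "version": int,
    "estimated_gps": dict,
    "confidence": (int, float),
    "source": str,
    "created_at": str,
}


def validate_result(record):
    """Return a list of problems; an empty list means the record looks valid."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for field: {name}")

    gps = record.get("estimated_gps")
    if isinstance(gps, dict):
        lat, lon = gps.get("lat"), gps.get("lon")
        if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
            problems.append("estimated_gps.lat missing or out of range")
        if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
            problems.append("estimated_gps.lon missing or out of range")

    confidence = record.get("confidence")
    if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
        problems.append("confidence outside [0, 1]")

    version = record.get("version")
    if isinstance(version, int) and version < 1:
        problems.append("version must start at 1")
    return problems
```

Returning a list of problems rather than raising on the first one lets a test report every missing field in a single assertion message.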
Success Criteria
Test Case 1 (Store Initial):
- Result stored successfully
- version = 1
- All fields present and valid
- Timestamp recorded
Test Case 2 (Store Refined):
- version = 2 stored
- version = 1 preserved
- Reference between versions maintained
- Refinement reason recorded
Test Case 3 (Batch Store):
- All 10 results stored
- Transaction atomic (all or nothing)
- Processing time < 1 second
Test Case 4 (Retrieve Single):
- Both versions returned
- Ordered by version number
- All fields complete
Test Case 5 (Retrieve Flight):
- All images returned
- Only latest versions
- Ordered by sequence_number
Test Case 6 (Filtering):
- Only matching results returned
- Filter applied correctly
- Query time < 500ms
Test Case 7 (Export JSON):
- Valid JSON file
- All results included
- Human-readable formatting
Test Case 8 (Export CSV):
- Valid CSV file
- Matches coordinates.csv format
- Can be opened in Excel/spreadsheet
- No missing values
Test Case 9 (Export KML):
- Valid KML (validates against schema)
- Displays correctly in Google Earth
- Includes image names and metadata
Test Case 10 (User Fix):
- User fix stored with source="user"
- confidence = 1.0 (user fixes are absolute)
- Triggers trajectory refinement notification
Test Case 11 (Statistics):
- All statistics calculated correctly
- Match manual calculations
- Accuracy targets validated (AC-1, AC-2)
- Registration rate validated (AC-9)
Test Case 12 (History):
- Complete timeline returned
- All versions present
- Chronological order
- Includes metadata for each version
Maximum Expected Time
- Store single result: < 100ms
- Store batch (10 results): < 1 second
- Retrieve single: < 100ms
- Retrieve flight (60 results): < 500ms
- Export to file (60 results): < 2 seconds
- Calculate statistics: < 1 second
- Total test suite: < 30 seconds
Test Execution Steps
- Setup Phase:
  a. Initialize Result Manager
  b. Create test flight
  c. Prepare test result data
  d. Load ground truth for comparison
- Test Case 1 - Store Initial:
  a. Call store_result() with initial estimate
  b. Verify database insertion
  c. Check all fields stored
  d. Validate timestamp
- Test Case 2 - Store Refined:
  a. Call store_result() with refined estimate
  b. Verify version increment
  c. Check version=1 still exists
  d. Validate refinement metadata
- Test Case 3 - Batch Store:
  a. Prepare 10 results
  b. Call store_results_batch()
  c. Verify transaction atomicity
  d. Check all stored correctly
- Test Case 4 - Retrieve Single:
  a. Call get_result() with image_id
  b. Request all versions
  c. Verify both returned
  d. Check ordering
- Test Case 5 - Retrieve Flight:
  a. Call get_flight_results() with flight_id
  b. Request latest only
  c. Verify all images present
  d. Check ordering by sequence
- Test Case 6 - Filtering:
  a. Call get_flight_results() with filters
  b. Verify filter application
  c. Validate query performance
  d. Check result correctness
- Test Case 7 - Export JSON:
  a. Call export_results(format="json")
  b. Write to file
  c. Validate JSON syntax
  d. Verify completeness
- Test Case 8 - Export CSV:
  a. Call export_results(format="csv")
  b. Write to file
  c. Validate CSV format
  d. Compare with ground truth CSV
- Test Case 9 - Export KML:
  a. Call export_results(format="kml")
  b. Write to file
  c. Validate KML schema
  d. Test in Google Earth if available
- Test Case 10 - User Fix:
  a. Call store_user_fix()
  b. Verify special handling
  c. Check refinement triggered
  d. Validate confidence=1.0
- Test Case 11 - Statistics:
  a. Call calculate_statistics()
  b. Compare with ground truth
  c. Verify all metrics calculated
  d. Check against AC-1, AC-2, AC-9
- Test Case 12 - History:
  a. Call get_result_history()
  b. Verify all versions returned
  c. Check chronological order
  d. Validate metadata completeness
Pass/Fail Criteria
Overall Test Passes If:
- All 12 test cases meet their success criteria
- No data loss
- All versions tracked correctly
- Exports generate valid files
- Statistics calculated accurately
- Query performance acceptable
Test Fails If:
- Any result fails to store
- Versions overwrite each other
- Data corruption occurs
- Exports invalid or incomplete
- Statistics incorrect
- Query times exceed 2x maximum
Additional Validation
Data Integrity:
- Foreign key constraints enforced (flight_id, image_id)
- No orphaned results
- Cascading deletes handled correctly
- Transaction isolation prevents dirty reads
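The atomicity and foreign-key checks above can be exercised against a throwaway database. The sketch below uses SQLite purely for illustration; the real Database Layer, its schema, and the column names are not specified in this document, so the `flights` and `results` tables here are assumptions, and `store_results_batch` only borrows the name used in the test steps.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical two-table schema with an enforced foreign key."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.execute("CREATE TABLE flights (flight_id TEXT PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE results (
            flight_id TEXT NOT NULL
                REFERENCES flights(flight_id) ON DELETE CASCADE,
            image_id  TEXT NOT NULL,
            version   INTEGER NOT NULL,
            lat       REAL NOT NULL,
            lon       REAL NOT NULL,
            PRIMARY KEY (flight_id, image_id, version)
        )
        """
    )


def store_results_batch(conn, rows):
    """Insert every row or none of them."""
    # The connection context manager commits on success and rolls back
    # the whole batch if any insert raises.
    with conn:
        conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
```

A test for "all or nothing" can then submit a batch whose last row references a non-existent flight and assert that the row count is unchanged afterwards.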
Versioning Logic:
- Version numbers increment sequentially
- No gaps in version sequence
- Latest version easily identifiable
- Historical versions immutable
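To make the versioning rules concrete, here is a minimal in-memory sketch of the expected behaviour. It is a test double, not the Result Manager implementation: `InMemoryResultStore` and `ResultVersion` are hypothetical names, and the method signatures are assumptions loosely modelled on the `store_result()` and `get_result()` calls named in the test steps.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: a stored version cannot be modified afterwards
class ResultVersion:
    image_id: str
    version: int
    lat: float
    lon: float
    source: str
    created_at: datetime
    refinement_reason: Optional[str] = None


class InMemoryResultStore:
    def __init__(self):
        # (flight_id, image_id) -> list of versions, oldest first
        self._history = {}

    def store_result(self, flight_id, image_id, lat, lon, source,
                     refinement_reason=None):
        versions = self._history.setdefault((flight_id, image_id), [])
        record = ResultVersion(
            image_id=image_id,
            version=len(versions) + 1,  # sequential, so no gaps can appear
            lat=lat,
            lon=lon,
            source=source,
            created_at=datetime.now(timezone.utc),
            refinement_reason=refinement_reason,
        )
        versions.append(record)  # append only: earlier versions are preserved
        return record

    def get_result(self, flight_id, image_id, include_all_versions=False):
        versions = self._history.get((flight_id, image_id), [])
        return list(versions) if include_all_versions else versions[-1:]
```

Because versions are only ever appended, the latest version is always the last list element and the history is already in version order.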
Export Formats:
JSON Format:
{
  "flight_id": "flight_123",
  "flight_name": "Test Flight",
  "total_images": 60,
  "results": [
    {
      "image": "AD000001.jpg",
      "sequence": 1,
      "gps": {"lat": 48.275292, "lon": 37.385220},
      "error_m": 0.25,
      "confidence": 0.92
    }
  ]
}
CSV Format:
image,sequence,lat,lon,altitude_m,error_m,confidence,source
AD000001.jpg,1,48.275292,37.385220,400,0.25,0.92,factor_graph
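A CSV export producing the layout above might look like the following sketch. The function name `write_results_csv` and the dictionary-shaped input rows are assumptions made for illustration; only the column order and the six-decimal coordinate formatting are taken from the sample.

```python
import csv
import io

CSV_COLUMNS = ["image", "sequence", "lat", "lon",
               "altitude_m", "error_m", "confidence", "source"]


def write_results_csv(results, out):
    """Write result dicts to a text stream, one row at a time."""
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in results:  # `results` may be a generator, so nothing is buffered
        writer.writerow({
            **row,
            # Fixed six decimals keeps trailing zeros, e.g. 37.385220
            "lat": f"{row['lat']:.6f}",
            "lon": f"{row['lon']:.6f}",
        })


if __name__ == "__main__":
    buffer = io.StringIO()
    write_results_csv(
        [{"image": "AD000001.jpg", "sequence": 1, "lat": 48.275292,
          "lon": 37.38522, "altitude_m": 400, "error_m": 0.25,
          "confidence": 0.92, "source": "factor_graph"}],
        buffer,
    )
    print(buffer.getvalue(), end="")
```

Writing row by row also fits the streaming-write requirement listed under Memory Management.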
KML Format:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>AD000001.jpg</name>
      <Point>
        <coordinates>37.385220,48.275292,400</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
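Note that KML writes coordinates as longitude,latitude,altitude, the reverse of the lat/lon order used elsewhere in this document, which is an easy place for an export bug. A minimal sketch of generating the structure above with the standard library follows; `build_kml` and the input dictionary keys are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(results):
    """Return a KML document (bytes) with one Placemark per result."""
    ET.register_namespace("", KML_NS)  # emit xmlns="..." without a prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for r in results:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = r["image"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML order is lon,lat,alt
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{r['lon']:.6f},{r['lat']:.6f},{r['altitude_m']}"
        )
    return ET.tostring(kml, encoding="UTF-8", xml_declaration=True)
```

A test can parse the returned bytes and compare each `coordinates` element against the stored result, which catches swapped latitude and longitude without needing Google Earth.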
Statistics Calculations:
Verify formulas:
- Error: haversine_distance(estimated, ground_truth)
- RMSE: sqrt(mean(errors^2))
- Percent < X: count(errors < X) / total * 100
- Registration Rate: processed_images / total_images * 100
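The formulas above can serve as an independent reference implementation for the "match manual calculations" check. The sketch below makes several assumptions that the document does not state: a mean Earth radius of 6,371,000 m, `total` in the percentage formula meaning the number of images that have ground truth, and inputs given as dictionaries mapping image name to a (lat, lon) pair. The function names mirror those in the document, but the signatures are illustrative.

```python
import math
from statistics import mean, median

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius


def haversine_distance(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def calculate_statistics(estimates, ground_truth, total_images):
    """Compute the metrics listed in Test Case 11."""
    # Images without ground truth are skipped, not counted as errors.
    errors = [haversine_distance(estimates[name], ground_truth[name])
              for name in estimates if ground_truth.get(name) is not None]
    stats = {
        "registration_rate":
            len(estimates) / total_images * 100 if total_images else None,
    }
    if not errors:
        return stats  # no ground truth: only the registration rate is defined
    n = len(errors)
    stats.update(
        mean_error_m=mean(errors),
        median_error_m=median(errors),
        rmse_m=math.sqrt(mean([e * e for e in errors])),
        percent_under_50m=sum(e < 50 for e in errors) / n * 100,
        percent_under_20m=sum(e < 20 for e in errors) / n * 100,
        max_error_m=max(errors),
    )
    return stats
```

Returning only `registration_rate` when no ground truth is available is one way to satisfy the "statistics gracefully handle nulls" requirement; the real component may choose a different convention.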
Accuracy Validation Against ACs:
- AC-1: percent_under_50m ≥ 80%
- AC-2: percent_under_20m ≥ 60%
- AC-9: registration_rate > 95%
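The thresholds above translate directly into a small table-driven check. This is only a sketch: `AC_CHECKS` and `check_acceptance` are hypothetical names, and the input is assumed to be a statistics dictionary keyed by the metric names from Test Case 11. The comparison operators follow the list exactly, so AC-9 uses a strict "greater than" while AC-1 and AC-2 use "greater than or equal".

```python
import operator

# (criterion, metric name, comparison, threshold in percent)
AC_CHECKS = [
    ("AC-1", "percent_under_50m", operator.ge, 80.0),
    ("AC-2", "percent_under_20m", operator.ge, 60.0),
    ("AC-9", "registration_rate", operator.gt, 95.0),
]


def check_acceptance(stats):
    """Map each acceptance criterion to True (met) or False (not met)."""
    outcome = {}
    for criterion, metric, compare, threshold in AC_CHECKS:
        value = stats.get(metric)
        # A missing metric counts as a failure rather than raising.
        outcome[criterion] = value is not None and compare(value, threshold)
    return outcome
```

Boundary values are worth testing explicitly: 80.0 passes AC-1, while exactly 95.0 fails AC-9 under the strict comparison.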
Performance Optimization:
- Database indexing on flight_id, image_id, sequence_number
- Caching frequently accessed results
- Batch operations for bulk inserts
- Pagination for large result sets
Concurrent Access:
- Multiple clients reading same results
- Concurrent writes to different flights
- Optimistic locking for updates
- No deadlocks
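One common way to implement optimistic locking is a compare-and-swap on a version column: the update succeeds only if the row still carries the version the writer last read. The sketch below illustrates the idea with SQLite; the `latest_results` table and the `bump_latest_version` helper are hypothetical and are not taken from the Result Manager's actual schema.

```python
import sqlite3

# Hypothetical pointer table: one row per image holding its latest version.
LATEST_RESULTS_SCHEMA = """
CREATE TABLE latest_results (
    flight_id TEXT NOT NULL,
    image_id  TEXT NOT NULL,
    version   INTEGER NOT NULL,
    PRIMARY KEY (flight_id, image_id)
)
"""


def bump_latest_version(conn, flight_id, image_id, expected_version):
    """Advance the pointer only if no other writer has moved it.

    Returns True on success and False when the caller's view was stale,
    in which case the caller should re-read and retry.
    """
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE latest_results SET version = version + 1 "
            "WHERE flight_id = ? AND image_id = ? AND version = ?",
            (flight_id, image_id, expected_version),
        )
    return cursor.rowcount == 1
```

Because no lock is held between the read and the write, two writers cannot deadlock on this row; the loser simply sees `False` and retries.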
Error Handling:
- Invalid flight_id: reject with clear error
- Duplicate result: update vs reject (configurable)
- Missing ground truth: statistics gracefully handle nulls
- Export to invalid path: fail with clear error
- Database connection failure: retry logic
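The retry behaviour for connection failures is not specified further in this document. As one possible shape, the sketch below retries an operation with exponential backoff; `with_retry`, its parameters, the choice of `ConnectionError` as the retryable exception, and the delay values are all assumptions for illustration.

```python
import time


def with_retry(operation, attempts=3, base_delay_s=0.1,
               retry_on=(ConnectionError,), sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter lets a test substitute a stub and assert on the delays without actually waiting, which helps keep the suite within its time budget.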
Audit Trail:
- Who created/modified result
- When each version created
- Why refinement occurred
- Source of each estimate (L1/L2/L3/FG/user)
Data Retention:
- Configurable retention policy
- Archival of old results
- Purging of test data
- Backup and recovery procedures
Integration with AC-8:
Verify "refine existing calculated results" functionality:
- Initial result stored immediately after L3 processing
- Refined result stored after factor graph optimization
- Both versions maintained
- Client notified via SSE when refinement occurs
- Latest version available via API
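For the SSE notification check, a test needs to know what a refinement event looks like on the wire. The Server-Sent Events format itself is standard (an optional `event:` line, one or more `data:` lines, and a terminating blank line), but the event name `result_refined` and the payload fields below are assumptions; the SSE Event Streamer's real event schema is not given in this document.

```python
import json


def format_sse_event(event, payload):
    """Serialize one Server-Sent Events message."""
    # Compact JSON contains no raw newlines, so a single data line suffices.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


if __name__ == "__main__":
    # Hypothetical refinement notification for Test Case 2
    print(format_sse_event(
        "result_refined",
        {"image_id": "AD000001", "version": 2},
    ), end="")
```

A test subscribed to the stream can parse the `data:` line back into JSON and assert that the announced version matches the latest version returned by the API.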
Query Capabilities:
- Get results by flight
- Get results by time range
- Get results by accuracy (error range)
- Get results by confidence threshold
- Get results by source (L3, factor_graph, user)
- Get results needing refinement
- Get results with errors
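Several of these queries reduce to simple predicates over result records, which gives the test an easy way to compute expected result sets independently of the database. The sketch below is such a reference filter; `filter_results` and its parameter names are hypothetical, and records are assumed to be dictionaries with the `confidence`, `error_m`, and `source` fields from the example record.

```python
def filter_results(results, min_confidence=None, max_error_m=None, source=None):
    """Yield results matching every filter that is set.

    Comparisons are strict, matching the "confidence > 0.9, error_m < 50"
    filter in Test Case 6.
    """
    for r in results:
        if min_confidence is not None and not r["confidence"] > min_confidence:
            continue
        if max_error_m is not None:
            error = r.get("error_m")
            # Results without ground truth have no error and cannot match.
            if error is None or not error < max_error_m:
                continue
        if source is not None and r["source"] != source:
            continue
        yield r
```

Implemented as a generator, the filter also works on streamed result sets without loading them fully into memory.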
Memory Management:
- Results not kept in memory unnecessarily
- Large result sets are streamed, not loaded entirely into memory
- Export operations use streaming writes
- No memory leaks on repeated queries