add tests

gen_tests updated solution.md updated
2026-04-23 01:46:38 +00:00 · 2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,434 @@
+# Integration Test: Result Manager
+
+## Summary
+Validate the Result Manager component responsible for storing, retrieving, and managing GPS localization results for processed images.
+
+## Component Under Test
+**Component**: Result Manager
+**Location**: `gps_denied_13_result_manager`
+**Dependencies**:
+- Database Layer (result persistence)
+- Factor Graph Optimizer (source of results)
+- Coordinate Transformer
+- SSE Event Streamer (result notifications)
+
+## Detailed Description
+This test validates that the Result Manager can:
+1. Store initial GPS estimates from the vision pipeline
+2. Store refined GPS results after factor graph optimization
+3. Track result versioning (initial vs refined per AC-8)
+4. Retrieve results by flight, image, or time range
+5. Support various output formats (JSON, CSV, KML)
+6. Calculate and store accuracy metrics
+7. Manage result updates when the trajectory is re-optimized
+8. Handle user-provided fixes and manual corrections
+9. Export results for external analysis
+10. Maintain result history and audit trail
+
+Per AC-8 ("system could refine existing calculated results and send refined results again to user"), the Result Manager must track both the initial and the refined result for each image.
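The versioning rule from AC-8 can be illustrated with a minimal in-memory sketch. It is not the Result Manager's implementation: the class `InMemoryResultStore`, the `ResultVersion` fields, and the parameter names are assumptions chosen to mirror the `store_result()` and `get_result()` calls named in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: a stored version cannot be mutated afterwards
class ResultVersion:
    version: int
    lat: float
    lon: float
    confidence: float
    source: str
    refinement_reason: Optional[str]
    created_at: datetime


class InMemoryResultStore:
    """Hypothetical stand-in for the Result Manager, for illustration only."""

    def __init__(self):
        # Maps (flight_id, image_id) to that image's versions, oldest first.
        self._history = {}

    def store_result(self, flight_id, image_id, lat, lon, confidence,
                     source, refinement_reason=None):
        versions = self._history.setdefault((flight_id, image_id), [])
        record = ResultVersion(
            version=len(versions) + 1,  # sequential numbering, no gaps
            lat=lat, lon=lon, confidence=confidence, source=source,
            refinement_reason=refinement_reason,
            created_at=datetime.now(timezone.utc),
        )
        versions.append(record)  # append only, so version 1 is preserved
        return record

    def get_result(self, flight_id, image_id, include_all_versions=False):
        versions = self._history.get((flight_id, image_id), [])
        return list(versions) if include_all_versions else versions[-1:]
```

Storing the Test Case 1 estimate and then the Test Case 2 refinement for the same image yields versions 1 and 2, and `get_result(..., include_all_versions=True)` returns both in version order.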
+
+## Input Data
+
+### Test Case 1: Store Initial Result
+- **Flight**: Test_Flight_001
+- **Image**: AD000001.jpg
+- **GPS Estimate**: 48.275290, 37.385218 (from L3)
+- **Ground Truth**: 48.275292, 37.385220
+- **Metadata**: confidence=0.92, processing_time_ms=450, layer="L3"
+- **Expected**: Result stored with version=1 (initial)
+
+### Test Case 2: Store Refined Result
+- **Flight**: Test_Flight_001
+- **Image**: AD000001.jpg (same as Test Case 1)
+- **GPS Refined**: 48.275291, 37.385219 (from Factor Graph)
+- **Metadata**: confidence=0.95, refinement_reason="new_anchor"
+- **Expected**: Result stored with version=2 (refined), version=1 preserved
+
+### Test Case 3: Batch Store Results
+- **Flight**: Test_Flight_002
+- **Images**: AD000001-AD000010
+- **Expected**: All 10 results stored atomically
+
+### Test Case 4: Retrieve Single Result
+- **Query**: Get result for AD000001.jpg in Test_Flight_001
+- **Options**: include_all_versions=true
+- **Expected**: Returns both version=1 (initial) and version=2 (refined)
+
+### Test Case 5: Retrieve Flight Results
+- **Query**: Get all results for Test_Flight_001
+- **Options**: latest_version_only=true
+- **Expected**: Returns latest version for each image
+
+### Test Case 6: Retrieve with Filtering
+- **Query**: Get results for Test_Flight_001
+- **Filter**: confidence > 0.9, error_m < 50
+- **Expected**: Returns only results matching criteria
+
+### Test Case 7: Export to JSON
+- **Flight**: Test_Flight_001
+- **Format**: JSON
+- **Expected**: Valid JSON file with all results
+
+### Test Case 8: Export to CSV
+- **Flight**: Test_Flight_001
+- **Format**: CSV
+- **Columns**: image, sequence, lat, lon, altitude_m, error_m, confidence, source
+- **Expected**: Valid CSV matching coordinates.csv format
+
+### Test Case 9: Export to KML
+- **Flight**: Test_Flight_001
+- **Format**: KML (for Google Earth)
+- **Expected**: Valid KML with placemarks for each image
+
+### Test Case 10: Store User Fix
+- **Flight**: Test_Flight_001
+- **Image**: AD000005.jpg
+- **User GPS**: 48.273997, 37.379828 (from ground truth)
+- **Metadata**: source="user", confidence=1.0
+- **Expected**: User fix stored with special flag, triggers refinement
+
+### Test Case 11: Calculate Statistics
+- **Flight**: Test_Flight_001 with ground truth
+- **Calculation**: Compare estimated vs ground truth
+- **Expected Statistics**:
+  - mean_error_m
+  - median_error_m
+  - rmse_m
+  - percent_under_50m
+  - percent_under_20m
+  - max_error_m
+  - registration_rate
+
+### Test Case 12: Result History
+- **Query**: Get history for AD000001.jpg
+- **Expected**: Returns timeline of all versions with timestamps
+
+## Expected Output
+
+Each result stored or returned by a test case is expected to follow this schema:
+```json
+{
+  "result_id": "unique_result_identifier",
+  "flight_id": "flight_123",
+  "image_id": "AD000001",
+  "sequence_number": 1,
+  "version": 1,
+  "estimated_gps": {
+    "lat": 48.275290,
+    "lon": 37.385218,
+    "altitude_m": 400
+  },
+  "ground_truth_gps": {
+    "lat": 48.275292,
+    "lon": 37.385220
+  },
+  "error_m": 0.25,
+  "confidence": 0.92,
+  "source": "L3|factor_graph|user",
+  "processing_time_ms": 450,
+  "metadata": {
+    "layer": "L3",
+    "num_features": 250,
+    "inlier_ratio": 0.85
+  },
+  "created_at": "timestamp",
+  "refinement_reason": "string|null"
+}
+```
+
+## Success Criteria
+
+**Test Case 1 (Store Initial)**:
+- Result stored successfully
+- version = 1
+- All fields present and valid
+- Timestamp recorded
+
+**Test Case 2 (Store Refined)**:
+- version = 2 stored
+- version = 1 preserved
+- Reference between versions maintained
+- Refinement reason recorded
+
+**Test Case 3 (Batch Store)**:
+- All 10 results stored
+- Transaction atomic (all or nothing)
+- Processing time < 1 second
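The all-or-nothing requirement for Test Case 3 can be checked against a throwaway database. The sketch below uses Python's standard `sqlite3` module as a stand-in for the Database Layer; the table layout and the helper names `make_db` and `store_results_batch` are assumptions, not the project's actual schema or API.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        " flight_id TEXT NOT NULL,"
        " image_id TEXT NOT NULL,"
        " lat REAL NOT NULL,"
        " lon REAL NOT NULL,"
        " UNIQUE (flight_id, image_id))"
    )
    return conn


def store_results_batch(conn, rows):
    """Insert every (flight_id, image_id, lat, lon) row, or none of them."""
    try:
        # The connection context manager commits on success and rolls back
        # the whole transaction if any statement raises.
        with conn:
            conn.executemany(
                "INSERT INTO results (flight_id, image_id, lat, lon)"
                " VALUES (?, ?, ?, ?)",
                rows,
            )
    except sqlite3.Error:
        return False
    return True
```

A batch whose last row violates a constraint should leave no rows from that batch behind, which is the property the atomicity check needs to observe.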
+
+**Test Case 4 (Retrieve Single)**:
+- Both versions returned
+- Ordered by version number
+- All fields complete
+
+**Test Case 5 (Retrieve Flight)**:
+- All images returned
+- Only latest versions
+- Ordered by sequence_number
+
+**Test Case 6 (Filtering)**:
+- Only matching results returned
+- Filter applied correctly
+- Query time < 500ms
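The retrieval rules for Test Cases 5 and 6 can be expressed as two small reference functions that a test can compare query output against. This is a sketch over plain dictionaries; the function names are hypothetical, and excluding results without ground truth from an `error_m` filter is an assumption.

```python
def latest_versions(results):
    """Keep the highest version per image, ordered by sequence_number."""
    latest = {}
    for r in results:
        kept = latest.get(r["image_id"])
        if kept is None or r["version"] > kept["version"]:
            latest[r["image_id"]] = r
    return sorted(latest.values(), key=lambda r: r["sequence_number"])


def filter_results(results, min_confidence=0.9, max_error_m=50.0):
    """Apply the Test Case 6 filter: confidence > 0.9 and error_m < 50."""
    return [
        r for r in results
        if r["confidence"] > min_confidence
        and r.get("error_m") is not None  # assumption: unknown error never matches
        and r["error_m"] < max_error_m
    ]
```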
+
+**Test Case 7 (Export JSON)**:
+- Valid JSON file
+- All results included
+- Human-readable formatting
+
+**Test Case 8 (Export CSV)**:
+- Valid CSV file
+- Matches coordinates.csv format
+- Opens in Excel or another spreadsheet application
+- No missing values
+
+**Test Case 9 (Export KML)**:
+- Valid KML (validates against schema)
+- Displays correctly in Google Earth
+- Includes image names and metadata
+
+**Test Case 10 (User Fix)**:
+- User fix stored with source="user"
+- confidence = 1.0 (user fixes are treated as authoritative)
+- Triggers trajectory refinement notification
+
+**Test Case 11 (Statistics)**:
+- All statistics calculated correctly
+- Match manual calculations
+- Accuracy targets validated (AC-1, AC-2)
+- Registration rate validated (AC-9)
+
+**Test Case 12 (History)**:
+- Complete timeline returned
+- All versions present
+- Chronological order
+- Includes metadata for each version
+
+## Maximum Expected Time
+- **Store single result**: < 100ms
+- **Store batch (10 results)**: < 1 second
+- **Retrieve single**: < 100ms
+- **Retrieve flight (60 results)**: < 500ms
+- **Export to file (60 results)**: < 2 seconds
+- **Calculate statistics**: < 1 second
+- **Total test suite**: < 30 seconds
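One way to enforce these limits inside a test is a small timing guard. The context manager below is a hypothetical helper, not part of the Result Manager; it measures wall-clock time with `time.perf_counter` and fails the test when the limit is exceeded.

```python
import time
from contextlib import contextmanager


@contextmanager
def time_limit(label, max_seconds):
    """Fail with AssertionError if the wrapped block takes max_seconds or more."""
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    assert elapsed < max_seconds, (
        f"{label} took {elapsed:.3f}s, limit is {max_seconds}s"
    )
```

A test would wrap the call under measurement, for example `with time_limit("store single result", 0.1): ...`, with the limit taken from the list above.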
+
+## Test Execution Steps
+
+1. **Setup Phase**:
+   a. Initialize Result Manager
+   b. Create test flight
+   c. Prepare test result data
+   d. Load ground truth for comparison
+
+2. **Test Case 1 - Store Initial**:
+   a. Call store_result() with initial estimate
+   b. Verify database insertion
+   c. Check all fields stored
+   d. Validate timestamp
+
+3. **Test Case 2 - Store Refined**:
+   a. Call store_result() with refined estimate
+   b. Verify version increment
+   c. Check version=1 still exists
+   d. Validate refinement metadata
+
+4. **Test Case 3 - Batch Store**:
+   a. Prepare 10 results
+   b. Call store_results_batch()
+   c. Verify transaction atomicity
+   d. Check all stored correctly
+
+5. **Test Case 4 - Retrieve Single**:
+   a. Call get_result() with image_id
+   b. Request all versions
+   c. Verify both returned
+   d. Check ordering
+
+6. **Test Case 5 - Retrieve Flight**:
+   a. Call get_flight_results() with flight_id
+   b. Request latest only
+   c. Verify all images present
+   d. Check ordering by sequence
+
+7. **Test Case 6 - Filtering**:
+   a. Call get_flight_results() with filters
+   b. Verify filter application
+   c. Validate query performance
+   d. Check result correctness
+
+8. **Test Case 7 - Export JSON**:
+   a. Call export_results(format="json")
+   b. Write to file
+   c. Validate JSON syntax
+   d. Verify completeness
+
+9. **Test Case 8 - Export CSV**:
+   a. Call export_results(format="csv")
+   b. Write to file
+   c. Validate CSV format
+   d. Compare with ground truth CSV
+
+10. **Test Case 9 - Export KML**:
+    a. Call export_results(format="kml")
+    b. Write to file
+    c. Validate KML schema
+    d. Test in Google Earth if available
+
+11. **Test Case 10 - User Fix**:
+    a. Call store_user_fix()
+    b. Verify special handling
+    c. Check refinement triggered
+    d. Validate confidence=1.0
+
+12. **Test Case 11 - Statistics**:
+    a. Call calculate_statistics()
+    b. Compare with ground truth
+    c. Verify all metrics calculated
+    d. Check against AC-1, AC-2, AC-9
+
+13. **Test Case 12 - History**:
+    a. Call get_result_history()
+    b. Verify all versions returned
+    c. Check chronological order
+    d. Validate metadata completeness
+
+## Pass/Fail Criteria
+
+**Overall Test Passes If**:
+- All 12 test cases meet their success criteria
+- No data loss
+- All versions tracked correctly
+- Exports generate valid files
+- Statistics calculated accurately
+- Query times stay within the limits in Maximum Expected Time
+
+**Test Fails If**:
+- Any result fails to store
+- Versions overwrite each other
+- Data corruption occurs
+- Exports invalid or incomplete
+- Statistics incorrect
+- Query times exceed 2x the limits in Maximum Expected Time
+
+## Additional Validation
+
+**Data Integrity**:
+- Foreign key constraints enforced (flight_id, image_id)
+- No orphaned results
+- Cascading deletes handled correctly
+- Transaction isolation prevents dirty reads
+
+**Versioning Logic**:
+- Version numbers increment sequentially
+- No gaps in version sequence
+- Latest version easily identifiable
+- Historical versions immutable
+
+**Export Formats**:
+
+**JSON Format**:
+```json
+{
+  "flight_id": "flight_123",
+  "flight_name": "Test Flight",
+  "total_images": 60,
+  "results": [
+    {
+      "image": "AD000001.jpg",
+      "sequence": 1,
+      "gps": {"lat": 48.275290, "lon": 37.385218},
+      "error_m": 0.25,
+      "confidence": 0.92
+    }
+  ]
+}
+```
+
+**CSV Format**:
+```
+image,sequence,lat,lon,altitude_m,error_m,confidence,source
+AD000001.jpg,1,48.275290,37.385218,400,0.25,0.92,L3
+```
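A reference writer for this layout helps when checking the exported file line by line. The sketch below uses the standard `csv` module and the header shown above; the function name `export_csv` and the choice to print coordinates with six decimal places are assumptions.

```python
import csv
import io

CSV_COLUMNS = ["image", "sequence", "lat", "lon",
               "altitude_m", "error_m", "confidence", "source"]


def export_csv(results):
    """Render result dictionaries as CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for r in results:
        row = dict(r)
        # Fixed precision keeps trailing zeros such as 48.275290.
        row["lat"] = f"{r['lat']:.6f}"
        row["lon"] = f"{r['lon']:.6f}"
        writer.writerow(row)
    return buffer.getvalue()
```

Formatting the coordinates explicitly matters because Python's default float-to-string conversion would drop the trailing zero and make a textual comparison with the expected file fail.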
+
+**KML Format**:
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<kml xmlns="http://www.opengis.net/kml/2.2">
+  <Document>
+    <Placemark>
+      <name>AD000001.jpg</name>
+      <Point>
+        <coordinates>37.385220,48.275292,400</coordinates>
+      </Point>
+    </Placemark>
+  </Document>
+</kml>
+```
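KML lists coordinates as longitude, latitude, altitude, which is the reverse of the lat/lon order used elsewhere in this document and an easy place for an exporter to go wrong. The sketch below builds the structure shown above with the standard `xml.etree.ElementTree` module; `build_kml` and its tuple input are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def _tag(name):
    """Qualify an element name with the KML namespace."""
    return f"{{{KML_NS}}}{name}"


def build_kml(points):
    """Build KML bytes from (name, lat, lon, altitude_m) tuples."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(_tag("kml"))
    document = ET.SubElement(kml, _tag("Document"))
    for name, lat, lon, altitude_m in points:
        placemark = ET.SubElement(document, _tag("Placemark"))
        ET.SubElement(placemark, _tag("name")).text = name
        point = ET.SubElement(placemark, _tag("Point"))
        # KML order is lon,lat,alt (not lat,lon).
        ET.SubElement(point, _tag("coordinates")).text = (
            f"{lon:.6f},{lat:.6f},{altitude_m:g}"
        )
    return ET.tostring(kml, encoding="UTF-8", xml_declaration=True)
```

A test can parse the exported file back and assert that the first number in each `coordinates` element is the longitude.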
+
+**Statistics Calculations**:
+
+Verify formulas:
+- **Error**: `haversine_distance(estimated, ground_truth)`
+- **RMSE**: `sqrt(mean(errors^2))`
+- **Percent < X**: `count(errors < X) / total * 100`
+- **Registration Rate**: `processed_images / total_images * 100`
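The formulas above can be written out as an independent reference for the "match manual calculations" check. This is a sketch, not the component's code: the Earth radius constant and the `summarize` helper are assumptions, and coordinates are taken as `(lat, lon)` pairs in degrees.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000.0  # assumption: mean Earth radius in meters


def haversine_distance(estimated, ground_truth):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, estimated)
    lat2, lon2 = map(math.radians, ground_truth)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def percent_under(errors, threshold_m):
    return sum(1 for e in errors if e < threshold_m) / len(errors) * 100


def registration_rate(processed_images, total_images):
    return processed_images / total_images * 100


def summarize(errors):
    """Collect the error-based statistics listed in Test Case 11."""
    return {
        "mean_error_m": statistics.mean(errors),
        "median_error_m": statistics.median(errors),
        "rmse_m": rmse(errors),
        "percent_under_50m": percent_under(errors, 50),
        "percent_under_20m": percent_under(errors, 20),
        "max_error_m": max(errors),
    }
```

Results without ground truth have no error value, so a caller would drop them before passing `errors` to these functions.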
+
+**Accuracy Validation Against ACs**:
+- **AC-1**: percent_under_50m ≥ 80%
+- **AC-2**: percent_under_20m ≥ 60%
+- **AC-9**: registration_rate > 95%
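The three thresholds differ in strictness: AC-1 and AC-2 are inclusive (≥) while AC-9 is strict (>). A table-driven check keeps that distinction explicit. The names `AC_CHECKS` and `validate_acceptance` are hypothetical; the metric keys follow Test Case 11.

```python
import operator

# Acceptance criterion -> (metric name, comparison, threshold in percent)
AC_CHECKS = {
    "AC-1": ("percent_under_50m", operator.ge, 80.0),
    "AC-2": ("percent_under_20m", operator.ge, 60.0),
    "AC-9": ("registration_rate", operator.gt, 95.0),
}


def validate_acceptance(stats):
    """Return a pass/fail flag per acceptance criterion."""
    return {
        ac: compare(stats[metric], threshold)
        for ac, (metric, compare, threshold) in AC_CHECKS.items()
    }
```

With values exactly on the boundary (80%, 60%, 95%), AC-1 and AC-2 pass and AC-9 fails, which is a useful edge case to include in the test data.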
+
+**Performance Optimization**:
+- Database indexing on flight_id, image_id, sequence_number
+- Caching frequently accessed results
+- Batch operations for bulk inserts
+- Pagination for large result sets
+
+**Concurrent Access**:
+- Multiple clients reading same results
+- Concurrent writes to different flights
+- Optimistic locking for updates
+- No deadlocks
+
+**Error Handling**:
+- Invalid flight_id: reject with clear error
+- Duplicate result: update or reject, depending on configuration
+- Missing ground truth: statistics gracefully handle nulls
+- Export to invalid path: fail with clear error
+- Database connection failure: retry logic
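The retry requirement for database connection failures could look like the following sketch. It assumes exponential backoff and uses the built-in `ConnectionError` as a stand-in for whatever exception the Database Layer raises; `with_retry` and its parameters are hypothetical. The `sleep` argument is injectable so that a test can run without waiting.

```python
import time


def with_retry(operation, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay_s * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
```

A test can pass an operation that fails a fixed number of times and record the requested delays to verify both the retry count and the backoff schedule.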
+
+**Audit Trail**:
+- Who created/modified result
+- When each version created
+- Why refinement occurred
+- Source of each estimate (L1/L2/L3/FG/user)
+
+**Data Retention**:
+- Configurable retention policy
+- Archival of old results
+- Purging of test data
+- Backup and recovery procedures
+
+**Integration with AC-8**:
+Verify "refine existing calculated results" functionality:
+1. Initial result stored immediately after L3 processing
+2. Refined result stored after factor graph optimization
+3. Both versions maintained
+4. Client notified via SSE when refinement occurs
+5. Latest version available via API
+
+**Query Capabilities**:
+- Get results by flight
+- Get results by time range
+- Get results by accuracy (error range)
+- Get results by confidence threshold
+- Get results by source (L3, factor_graph, user)
+- Get results needing refinement
+- Get results with errors
+
+**Memory Management**:
+- Results not kept in memory unnecessarily
+- Large result sets are streamed, not loaded entirely into memory
+- Export operations use streaming writes
+- No memory leaks on repeated queries
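Streaming can be sketched with a generator that pulls one page at a time and a writer that emits each result as soon as it arrives. Two assumptions apply: `fetch_page(offset, limit)` is a hypothetical paginated query, and the output is JSON Lines (one object per line) rather than the single JSON document shown under Export Formats, because line-oriented output is simpler to write incrementally.

```python
import json


def iter_results(fetch_page, page_size=100):
    """Yield results one by one, holding at most one page in memory."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


def export_json_lines(results, out):
    """Write each result as one JSON line to a file-like object; return the count."""
    count = 0
    for r in results:
        out.write(json.dumps(r, sort_keys=True))
        out.write("\n")
        count += 1
    return count
```

Because `iter_results` is lazy, peak memory depends on `page_size` and not on the number of results in the flight.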
+