gps-denied-desktop/docs/03_tests/12_result_manager_integration_spec.md
Oleksandr Bezdieniezhnykh 4f8c18a066 add tests
gen_tests updated
solution.md updated
2025-11-24 22:57:46 +02:00

Integration Test: Result Manager

Summary

Validate the Result Manager component responsible for storing, retrieving, and managing GPS localization results for processed images.

Component Under Test

Component: Result Manager
Location: gps_denied_13_result_manager
Dependencies:

  • Database Layer (result persistence)
  • Factor Graph Optimizer (source of results)
  • Coordinate Transformer
  • SSE Event Streamer (result notifications)

Detailed Description

This test validates that the Result Manager can:

  1. Store initial GPS estimates from the vision pipeline
  2. Store refined GPS results after factor graph optimization
  3. Track result versioning (initial vs refined per AC-8)
  4. Retrieve results by flight, image, or time range
  5. Support various output formats (JSON, CSV, KML)
  6. Calculate and store accuracy metrics
  7. Manage result updates when the trajectory is re-optimized
  8. Handle user-provided fixes and manual corrections
  9. Export results for external analysis
  10. Maintain result history and audit trail

Per AC-8 ("system could refine existing calculated results and send refined results again to user"), the Result Manager must track both initial and refined results.

Input Data

Test Case 1: Store Initial Result

  • Flight: Test_Flight_001
  • Image: AD000001.jpg
  • GPS Estimate: 48.275290, 37.385218 (from L3)
  • Ground Truth: 48.275292, 37.385220
  • Metadata: confidence=0.92, processing_time_ms=450, layer="L3"
  • Expected: Result stored with version=1 (initial)

Test Case 2: Store Refined Result

  • Flight: Test_Flight_001
  • Image: AD000001.jpg (same as Test Case 1)
  • GPS Refined: 48.275291, 37.385219 (from Factor Graph)
  • Metadata: confidence=0.95, refinement_reason="new_anchor"
  • Expected: Result stored with version=2 (refined), version=1 preserved

Test Case 3: Batch Store Results

  • Flight: Test_Flight_002
  • Images: AD000001-AD000010
  • Expected: All 10 results stored atomically

Test Case 4: Retrieve Single Result

  • Query: Get result for AD000001.jpg in Test_Flight_001
  • Options: include_all_versions=true
  • Expected: Returns both version=1 (initial) and version=2 (refined)

Test Case 5: Retrieve Flight Results

  • Query: Get all results for Test_Flight_001
  • Options: latest_version_only=true
  • Expected: Returns latest version for each image

Test Case 6: Retrieve with Filtering

  • Query: Get results for Test_Flight_001
  • Filter: confidence > 0.9, error_m < 50
  • Expected: Returns only results matching criteria

Test Case 7: Export to JSON

  • Flight: Test_Flight_001
  • Format: JSON
  • Expected: Valid JSON file with all results

Test Case 8: Export to CSV

  • Flight: Test_Flight_001
  • Format: CSV
  • Columns: image, sequence, lat, lon, altitude_m, error_m, confidence, source
  • Expected: Valid CSV matching coordinates.csv format

Test Case 9: Export to KML

  • Flight: Test_Flight_001
  • Format: KML (for Google Earth)
  • Expected: Valid KML with placemarks for each image

Test Case 10: Store User Fix

  • Flight: Test_Flight_001
  • Image: AD000005.jpg
  • User GPS: 48.273997, 37.379828 (from ground truth)
  • Metadata: source="user", confidence=1.0
  • Expected: User fix stored with special flag, triggers refinement

Test Case 11: Calculate Statistics

  • Flight: Test_Flight_001 with ground truth
  • Calculation: Compare estimated vs ground truth
  • Expected Statistics:
    • mean_error_m
    • median_error_m
    • rmse_m
    • percent_under_50m
    • percent_under_20m
    • max_error_m
    • registration_rate

Test Case 12: Result History

  • Query: Get history for AD000001.jpg
  • Expected: Returns timeline of all versions with timestamps

Expected Output

For each test case, a stored or returned result has the following structure:

{
  "result_id": "unique_result_identifier",
  "flight_id": "flight_123",
  "image_id": "AD000001",
  "sequence_number": 1,
  "version": 1,
  "estimated_gps": {
    "lat": 48.275290,
    "lon": 37.385218,
    "altitude_m": 400
  },
  "ground_truth_gps": {
    "lat": 48.275292,
    "lon": 37.385220
  },
  "error_m": 0.25,
  "confidence": 0.92,
  "source": "L3|factor_graph|user",
  "processing_time_ms": 450,
  "metadata": {
    "layer": "L3",
    "num_features": 250,
    "inlier_ratio": 0.85
  },
  "created_at": "timestamp",
  "refinement_reason": "string|null"
}

Success Criteria

Test Case 1 (Store Initial):

  • Result stored successfully
  • version = 1
  • All fields present and valid
  • Timestamp recorded

Test Case 2 (Store Refined):

  • version = 2 stored
  • version = 1 preserved
  • Reference between versions maintained
  • Refinement reason recorded

Test Case 3 (Batch Store):

  • All 10 results stored
  • Transaction atomic (all or nothing)
  • Processing time < 1 second
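The "all or nothing" requirement for Test Case 3 can be illustrated with a minimal sketch. It assumes a SQLite backend and a hypothetical `results` table. The spec does not name the database, and `store_results_batch()` is named in the spec, but its signature and body here are assumptions.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        " flight_id TEXT NOT NULL, image_id TEXT NOT NULL,"
        " version INTEGER NOT NULL, lat REAL NOT NULL, lon REAL NOT NULL,"
        " UNIQUE (flight_id, image_id, version))"
    )
    return conn


def store_results_batch(conn: sqlite3.Connection, rows) -> bool:
    """Insert every row in one transaction; any failure rolls back all rows."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO results (flight_id, image_id, version, lat, lon) "
                "VALUES (?, ?, ?, ?, ?)",
                rows,
            )
        return True
    except sqlite3.Error:
        return False


def count_results(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

A test for atomicity would submit a batch whose last row violates the uniqueness constraint and then check that the row count is unchanged.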

Test Case 4 (Retrieve Single):

  • Both versions returned
  • Ordered by version number
  • All fields complete

Test Case 5 (Retrieve Flight):

  • All images returned
  • Only latest versions
  • Ordered by sequence_number

Test Case 6 (Filtering):

  • Only matching results returned
  • Filter applied correctly
  • Query time < 500ms
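The retrieval rules for Test Cases 5 and 6 (latest version per image, ordered by sequence_number, strict filter comparisons) can be sketched over plain dictionaries. The function names and the treatment of results without ground truth are assumptions made for illustration; a real implementation would likely push this logic into the database query.

```python
from typing import Dict, Iterable, List, Optional


def latest_versions(results: Iterable[dict]) -> List[dict]:
    """Keep the highest version per image, ordered by sequence_number."""
    latest: Dict[str, dict] = {}
    for r in results:
        current = latest.get(r["image_id"])
        if current is None or r["version"] > current["version"]:
            latest[r["image_id"]] = r
    return sorted(latest.values(), key=lambda r: r["sequence_number"])


def filter_results(
    results: Iterable[dict],
    min_confidence: Optional[float] = None,
    max_error_m: Optional[float] = None,
) -> List[dict]:
    """Apply strict bounds, mirroring 'confidence > 0.9, error_m < 50'."""
    out = []
    for r in results:
        if min_confidence is not None and not r["confidence"] > min_confidence:
            continue
        if max_error_m is not None:
            # Assumption: a result with no ground truth (error_m is None)
            # cannot satisfy an error filter and is excluded.
            if r["error_m"] is None or not r["error_m"] < max_error_m:
                continue
        out.append(r)
    return out
```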

Test Case 7 (Export JSON):

  • Valid JSON file
  • All results included
  • Human-readable formatting

Test Case 8 (Export CSV):

  • Valid CSV file
  • Matches coordinates.csv format
  • Can be opened in Excel/spreadsheet
  • No missing values

Test Case 9 (Export KML):

  • Valid KML (validates against schema)
  • Displays correctly in Google Earth
  • Includes image names and metadata

Test Case 10 (User Fix):

  • User fix stored with source="user"
  • confidence = 1.0 (user fixes are absolute)
  • Triggers trajectory refinement notification

Test Case 11 (Statistics):

  • All statistics calculated correctly
  • Match manual calculations
  • Accuracy targets validated (AC-1, AC-2)
  • Registration rate validated (AC-9)

Test Case 12 (History):

  • Complete timeline returned
  • All versions present
  • Chronological order
  • Includes metadata for each version

Maximum Expected Time

  • Store single result: < 100ms
  • Store batch (10 results): < 1 second
  • Retrieve single: < 100ms
  • Retrieve flight (60 results): < 500ms
  • Export to file (60 results): < 2 seconds
  • Calculate statistics: < 1 second
  • Total test suite: < 30 seconds
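The limits above can be encoded once and reused by every test case. The sketch below is one possible helper, not part of the component: the dictionary keys and function names are hypothetical, and the 2x rule is taken from the "Test Fails If" list.

```python
import time

# Limits from "Maximum Expected Time", in milliseconds (keys are hypothetical).
LIMITS_MS = {
    "store_single": 100,
    "store_batch_10": 1000,
    "retrieve_single": 100,
    "retrieve_flight_60": 500,
    "export_60": 2000,
    "calculate_statistics": 1000,
}


def timed(operation):
    """Run a zero-argument callable and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = operation()
    return result, (time.perf_counter() - start) * 1000.0


def within_limit(name: str, elapsed_ms: float) -> bool:
    """True when the operation meets its stated limit."""
    return elapsed_ms < LIMITS_MS[name]


def hard_failure(name: str, elapsed_ms: float) -> bool:
    """True when the operation exceeds 2x its limit, which fails the test."""
    return elapsed_ms > 2 * LIMITS_MS[name]
```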

Test Execution Steps

  1. Setup Phase:
     a. Initialize Result Manager
     b. Create test flight
     c. Prepare test result data
     d. Load ground truth for comparison

  2. Test Case 1 - Store Initial:
     a. Call store_result() with initial estimate
     b. Verify database insertion
     c. Check all fields stored
     d. Validate timestamp

  3. Test Case 2 - Store Refined:
     a. Call store_result() with refined estimate
     b. Verify version increment
     c. Check version=1 still exists
     d. Validate refinement metadata

  4. Test Case 3 - Batch Store:
     a. Prepare 10 results
     b. Call store_results_batch()
     c. Verify transaction atomicity
     d. Check all stored correctly

  5. Test Case 4 - Retrieve Single:
     a. Call get_result() with image_id
     b. Request all versions
     c. Verify both returned
     d. Check ordering

  6. Test Case 5 - Retrieve Flight:
     a. Call get_flight_results() with flight_id
     b. Request latest only
     c. Verify all images present
     d. Check ordering by sequence

  7. Test Case 6 - Filtering:
     a. Call get_flight_results() with filters
     b. Verify filter application
     c. Validate query performance
     d. Check result correctness

  8. Test Case 7 - Export JSON:
     a. Call export_results(format="json")
     b. Write to file
     c. Validate JSON syntax
     d. Verify completeness

  9. Test Case 8 - Export CSV:
     a. Call export_results(format="csv")
     b. Write to file
     c. Validate CSV format
     d. Compare with ground truth CSV

  10. Test Case 9 - Export KML:
     a. Call export_results(format="kml")
     b. Write to file
     c. Validate KML schema
     d. Test in Google Earth if available

  11. Test Case 10 - User Fix:
     a. Call store_user_fix()
     b. Verify special handling
     c. Check refinement triggered
     d. Validate confidence=1.0

  12. Test Case 11 - Statistics:
     a. Call calculate_statistics()
     b. Compare with ground truth
     c. Verify all metrics calculated
     d. Check against AC-1, AC-2, AC-9

  13. Test Case 12 - History:
     a. Call get_result_history()
     b. Verify all versions returned
     c. Check chronological order
     d. Validate metadata completeness

Pass/Fail Criteria

Overall Test Passes If:

  • All 12 test cases meet their success criteria
  • No data loss
  • All versions tracked correctly
  • Exports generate valid files
  • Statistics calculated accurately
  • Query performance acceptable (no operation exceeds 2x its limit under Maximum Expected Time)

Test Fails If:

  • Any result fails to store
  • Versions overwrite each other
  • Data corruption occurs
  • Exports invalid or incomplete
  • Statistics incorrect
  • Query times exceed 2x maximum

Additional Validation

Data Integrity:

  • Foreign key constraints enforced (flight_id, image_id)
  • No orphaned results
  • Cascading deletes handled correctly
  • Transaction isolation prevents dirty reads
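The foreign key and cascading delete checks can be exercised against a small schema. The sketch below assumes SQLite and hypothetical `flights` and `results` tables, since the spec does not define the schema. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical two-table schema with FK enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE flights (flight_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE results ("
        " result_id INTEGER PRIMARY KEY,"
        " flight_id TEXT NOT NULL"
        "   REFERENCES flights(flight_id) ON DELETE CASCADE,"
        " image_id TEXT NOT NULL,"
        " version INTEGER NOT NULL)"
    )
    return conn


def insert_result(conn, flight_id: str, image_id: str, version: int) -> None:
    """Raises sqlite3.IntegrityError when flight_id does not exist."""
    with conn:
        conn.execute(
            "INSERT INTO results (flight_id, image_id, version) VALUES (?, ?, ?)",
            (flight_id, image_id, version),
        )


def delete_flight(conn, flight_id: str) -> None:
    """Deleting a flight removes its results through ON DELETE CASCADE."""
    with conn:
        conn.execute("DELETE FROM flights WHERE flight_id = ?", (flight_id,))
```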

Versioning Logic:

  • Version numbers increment sequentially
  • No gaps in version sequence
  • Latest version easily identifiable
  • Historical versions immutable
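The versioning rules above can be sketched with an append-only, in-memory store. This is an illustration under stated assumptions, not the component's implementation: `VersionedResults` and `ResultVersion` are hypothetical names, and the `store_result()` and `get_result()` signatures are guesses based on the test steps.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)  # frozen, so a stored version cannot be modified in place
class ResultVersion:
    version: int
    lat: float
    lon: float
    confidence: float
    source: str
    refinement_reason: Optional[str]
    created_at: datetime


class VersionedResults:
    """Hypothetical in-memory stand-in for the Result Manager's version logic."""

    def __init__(self) -> None:
        self._history: Dict[Tuple[str, str], List[ResultVersion]] = {}

    def store_result(self, flight_id: str, image_id: str, lat: float, lon: float,
                     confidence: float, source: str,
                     refinement_reason: Optional[str] = None) -> ResultVersion:
        versions = self._history.setdefault((flight_id, image_id), [])
        record = ResultVersion(
            version=len(versions) + 1,  # sequential, so no gaps can appear
            lat=lat, lon=lon, confidence=confidence, source=source,
            refinement_reason=refinement_reason,
            created_at=datetime.now(timezone.utc),
        )
        versions.append(record)  # append-only: earlier versions are never overwritten
        return record

    def get_result(self, flight_id: str, image_id: str,
                   include_all_versions: bool = False) -> List[ResultVersion]:
        versions = self._history[(flight_id, image_id)]
        return list(versions) if include_all_versions else [versions[-1]]
```

Storing the Test Case 1 estimate and then the Test Case 2 refinement for the same image yields versions 1 and 2, with version 1 still retrievable through `include_all_versions=True`.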

Export Formats:

JSON Format:

{
  "flight_id": "flight_123",
  "flight_name": "Test Flight",
  "total_images": 60,
  "results": [
    {
      "image": "AD000001.jpg",
      "sequence": 1,
      "gps": {"lat": 48.275290, "lon": 37.385218},
      "error_m": 0.25,
      "confidence": 0.92
    }
  ]
}

CSV Format:

image,sequence,lat,lon,altitude_m,error_m,confidence,source
AD000001.jpg,1,48.275290,37.385218,400,0.25,0.92,L3
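A writer for this layout might look like the following sketch, which uses Python's standard `csv` module. The function name `results_to_csv` and the dictionary keys are assumptions. Coordinates are formatted to six decimal places so that trailing zeros survive the export.

```python
import csv
import io

CSV_COLUMNS = ["image", "sequence", "lat", "lon",
               "altitude_m", "error_m", "confidence", "source"]


def results_to_csv(results) -> str:
    """Serialize result dictionaries using the column order shown above."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for r in results:
        missing = [c for c in CSV_COLUMNS if r.get(c) is None]
        if missing:
            # The success criteria require "No missing values".
            raise ValueError(f"missing values for {missing} in {r.get('image')}")
        row = dict(r)
        row["lat"] = f"{r['lat']:.6f}"  # keep six decimals, e.g. 37.385220
        row["lon"] = f"{r['lon']:.6f}"
        writer.writerow(row)
    return buf.getvalue()
```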

KML Format:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>AD000001.jpg</name>
      <Point>
        <coordinates>37.385220,48.275292,400</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
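KML coordinates are written as longitude, latitude, altitude, which is the reverse of the lat/lon order used elsewhere in this spec and an easy place for an export bug. The sketch below builds the structure shown above with the standard library. The function name and input keys are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def results_to_kml(results) -> str:
    """Build a KML document with one Placemark per result."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for r in results:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = r["image"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML order is lon,lat,alt (not lat,lon).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{r['lon']:.6f},{r['lat']:.6f},{r['altitude_m']}"
        )
    body = ET.tostring(kml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```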

Statistics Calculations:

Verify formulas:

  • Error: haversine_distance(estimated, ground_truth)
  • RMSE: sqrt(mean(errors^2))
  • Percent < X: count(errors < X) / total * 100
  • Registration Rate: processed_images / total_images * 100
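The formulas above translate into a short reference implementation that a test can compare against the component's output. This is a sketch: the Earth radius constant, the function signatures, and the decision to skip results without ground truth are assumptions.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch


def haversine_distance(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def calculate_statistics(errors_m, processed_images: int, total_images: int) -> dict:
    """Compute the Test Case 11 metrics; None errors (no ground truth) are skipped."""
    errors = [e for e in errors_m if e is not None]
    if not errors:
        raise ValueError("no results with ground truth")
    n = len(errors)
    return {
        "mean_error_m": statistics.fmean(errors),
        "median_error_m": statistics.median(errors),
        "rmse_m": math.sqrt(sum(e * e for e in errors) / n),
        "percent_under_50m": sum(e < 50 for e in errors) / n * 100,
        "percent_under_20m": sum(e < 20 for e in errors) / n * 100,
        "max_error_m": max(errors),
        "registration_rate": processed_images / total_images * 100,
    }
```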

Accuracy Validation Against ACs:

  • AC-1: percent_under_50m ≥ 80%
  • AC-2: percent_under_20m ≥ 60%
  • AC-9: registration_rate > 95%
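These thresholds can be checked mechanically once the statistics are available. In the sketch below, `check_acceptance` is a hypothetical helper. It keeps the comparison operators exactly as listed: inclusive for AC-1 and AC-2, strict for AC-9.

```python
def check_acceptance(stats: dict) -> dict:
    """Map each acceptance criterion to True (met) or False (not met)."""
    return {
        "AC-1": stats["percent_under_50m"] >= 80.0,
        "AC-2": stats["percent_under_20m"] >= 60.0,
        "AC-9": stats["registration_rate"] > 95.0,  # strict, so 95.0 fails
    }
```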

Performance Optimization:

  • Database indexing on flight_id, image_id, sequence_number
  • Caching frequently accessed results
  • Batch operations for bulk inserts
  • Pagination for large result sets

Concurrent Access:

  • Multiple clients reading same results
  • Concurrent writes to different flights
  • Optimistic locking for updates
  • No deadlocks
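Optimistic locking is commonly implemented as a conditional update that succeeds only if the row still holds the version the writer last read. The sketch below shows that pattern under assumptions: a SQLite backend and a hypothetical `image_state` table that tracks the latest version per image.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical table holding the latest version per image."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_state ("
        " flight_id TEXT NOT NULL, image_id TEXT NOT NULL,"
        " latest_version INTEGER NOT NULL,"
        " PRIMARY KEY (flight_id, image_id))"
    )
    return conn


def advance_latest_version(conn, flight_id: str, image_id: str,
                           expected_version: int) -> bool:
    """Bump latest_version only if it still equals expected_version.

    Returns False when another writer got there first; the caller should
    re-read the current state and retry instead of overwriting.
    """
    with conn:
        cur = conn.execute(
            "UPDATE image_state SET latest_version = latest_version + 1 "
            "WHERE flight_id = ? AND image_id = ? AND latest_version = ?",
            (flight_id, image_id, expected_version),
        )
    return cur.rowcount == 1
```

Because no lock is held between the read and the write, this approach cannot deadlock. A conflicting writer simply sees `False`.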

Error Handling:

  • Invalid flight_id: reject with clear error
  • Duplicate result: update or reject, depending on configuration
  • Missing ground truth: statistics gracefully handle nulls
  • Export to invalid path: fail with clear error
  • Database connection failure: retry logic
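The retry behavior for database connection failures is not specified further, so the following is only one plausible shape: a bounded number of attempts with exponential backoff, re-raising the last error so the caller still gets a clear failure. The helper name, the defaults, and the use of `ConnectionError` are assumptions.

```python
import time


def with_retry(operation, attempts: int = 3, base_delay_s: float = 0.1,
               sleep=time.sleep):
    """Call operation(); on ConnectionError retry with exponential backoff.

    The sleep function is injectable so tests can run without real delays.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay_s * 2 ** (attempt - 1))
```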

Audit Trail:

  • Who created/modified result
  • When each version created
  • Why refinement occurred
  • Source of each estimate (L1/L2/L3/factor_graph/user)

Data Retention:

  • Configurable retention policy
  • Archival of old results
  • Purging of test data
  • Backup and recovery procedures

Integration with AC-8:

Verify "refine existing calculated results" functionality:

  1. Initial result stored immediately after L3 processing
  2. Refined result stored after factor graph optimization
  3. Both versions maintained
  4. Client notified via SSE when refinement occurs
  5. Latest version available via API
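For step 4, a test can assert on the wire format of the notification. Server-sent events are plain text: an optional `event:` line, one or more `data:` lines, and a blank line that terminates the message. The sketch below formats such a message. The event name `result_refined` and the payload fields used in any test are hypothetical, since the spec does not define them.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Format one server-sent event with a single-line JSON payload."""
    # json.dumps emits no newlines by default, so one data line is enough.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"
```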

Query Capabilities:

  • Get results by flight
  • Get results by time range
  • Get results by accuracy (error range)
  • Get results by confidence threshold
  • Get results by source (L3, factor_graph, user)
  • Get results needing refinement
  • Get results with errors

Memory Management:

  • Results not kept in memory unnecessarily
  • Large result sets streamed, not loaded entirely into memory
  • Export operations use streaming writes
  • No memory leaks on repeated queries
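Streaming reads and writes can be combined so that an export never holds a whole flight in memory. The sketch below assumes SQLite and a hypothetical `results` table with `image`, `lat`, `lon`, and `sequence` columns. It fetches rows in chunks and writes each one as soon as it arrives.

```python
import csv
import sqlite3


def iter_results(conn: sqlite3.Connection, flight_id: str, chunk_size: int = 1000):
    """Yield result rows for one flight, fetching chunk_size rows at a time."""
    cur = conn.execute(
        "SELECT image, lat, lon FROM results "
        "WHERE flight_id = ? ORDER BY sequence",
        (flight_id,),
    )
    while True:
        chunk = cur.fetchmany(chunk_size)
        if not chunk:
            return
        yield from chunk


def stream_csv(rows, out) -> int:
    """Write rows to a file-like object one at a time; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["image", "lat", "lon"])
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written
```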