add solution drafts 3 times, used research skill, expand acceptance criteria

2026-06-21 09:11:12 +00:00 · 2026-03-14 20:38:00 +02:00
parent 767874cb90
commit d764250f9a
23 changed files with 3385 additions and 1 deletions
@@ -0,0 +1,76 @@
+# Acceptance Criteria Assessment
+
+## System Parameters (Calculated)
+
+| Parameter | Value |
+|-----------|-------|
+| GSD (at 400m) | 6.01 cm/pixel |
+| Ground footprint | 376m × 250m |
+| Consecutive overlap | 60-73% (at 100m intervals) |
+| Pixels per 50m | ~832 pixels |
+| Pixels per 20m | ~333 pixels |
+
+## Acceptance Criteria
+
+| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
+|-----------|-----------|-------------------|---------------------|--------|
+| GPS accuracy: 80% within 50m | 50m error for 80% of photos | NaviLoc: 19.5m MLE at 50-150m alt. Mateos-Ramirez: 143m mean at >1000m alt (with IMU). At 400m with 26MP + satellite correction, 50m for 80% is achievable with VO+SIM. No IMU adds ~30-50% error overhead. | Medium cost — needs robust satellite matching pipeline. ~3-4 weeks for core pipeline. | **Achievable** — keep as-is |
+| GPS accuracy: 60% within 20m | 20m error for 60% of photos | NaviLoc: 19.5m MLE at lower altitude (50-150m). At 400m, larger viewpoint gap increases error. Cross-view matching MA@20m improved by +10% with the latest methods (SSPT, Source #11). Needs high-quality satellite imagery and robust matching. | Higher cost: requires higher-quality satellite imagery (0.3-0.5m resolution). Additional 1-2 weeks for refinement. | **Challenging but achievable**: consider relaxing to 30m initially, then tighten with iteration |
+| Handle 350m outlier photos | Tolerate up to 350m jump between consecutive photos | Standard VO systems detect outliers via feature matching failure. 350m at GSD 6cm = ~5833 pixels. Satellite re-localization can handle this if area is textured. | Low additional cost — outlier detection is standard in VO pipelines. | **Achievable** — keep as-is |
+| Sharp turns: <5% overlap, <200m drift, <70° angle | System continues working during sharp turns | <5% overlap means consecutive feature matching will fail. Must fall back to satellite matching for absolute position. With a 376m footprint at 400m altitude, a 200m drift still leaves partial overlap between the photo and the satellite tile at the predicted position. 70° rotation is large: AKAZE is rotation-invariant, while SuperPoint tolerates only moderate rotation and may need the image pre-rotated by the estimated heading. | High complexity: requires multi-strategy architecture (VO primary, satellite fallback). +2-3 weeks. | **Achievable with architectural investment**: keep as-is |
+| Route disconnection & reconnection | Handle multiple disconnected route segments | Each segment needs independent satellite geo-referencing. Segments are stitched via common satellite reference frame. Similar to loop closure in SLAM but via external reference. | High complexity — core architectural challenge. +2-3 weeks for segment management. | **Achievable** — this should be a core design principle, not an edge case |
+| User input fallback (20% of route) | User provides GPS when system cannot determine | Simple UI interaction — user clicks approximate position on map. Becomes new anchor point. | Low cost — straightforward feature. | **Achievable** — keep as-is |
+| Processing speed: <5s per image | 5 seconds maximum per image | SuperPoint: ~50-100ms. LightGlue: ~20-50ms. Satellite crop+match: ~200-500ms. Full pipeline: ~500ms-2s on RTX 2060. NaviLoc runs 9 FPS on Raspberry Pi 5. ORB-SLAM3 with GPU: 30 FPS on Jetson TX2. | Low risk — well within budget on RTX 2060+. | **Easily achievable** — could target <2s. Keep 5s as safety margin |
+| Real-time streaming via SSE | Results appear immediately, refinement sent later | Standard architecture pattern. Process-and-stream is well-supported. | Low cost — standard web engineering. | **Achievable** — keep as-is |
+| Image Registration Rate > 95% | >95% of images successfully registered | ITU thesis: 93% SIM matching. With 60-73% consecutive overlap and deep learning features, >95% for VO between consecutive frames is achievable. The 5% tolerance covers sharp turns. | Medium cost — depends on feature matcher quality and satellite image quality. | **Achievable** — but interpret as "95% for normal consecutive frames". Sharp turn frames counted separately. |
+| MRE < 1.0 pixels | Mean Reprojection Error below 1 pixel | Sub-pixel accuracy is standard for SuperPoint/LightGlue. SVO achieves sub-pixel via direct methods. Typical range: 0.3-0.8 pixels. | No additional cost — inherent to modern matchers. | **Easily achievable** — keep as-is |
+| REST API + SSE background service | Always-running service, start on request, stream results | Standard Python (FastAPI) or .NET architecture. | Low cost — standard engineering. ~1 week for API layer. | **Achievable** — keep as-is |
+
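For the SSE streaming criterion above, the wire format is simple enough to sketch without a framework. This is a minimal illustration, not the project's implementation; the event names `position` and `refinement`, the payload keys, and both function names are assumptions made for this example.

```python
import json


def format_sse(event, payload, event_id=None):
    """Serialize one Server-Sent Events message (text/event-stream framing)."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps emits a single line, so one "data:" field is enough.
    lines.append(f"data: {json.dumps(payload)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def stream_results(estimates):
    """Yield a first estimate per image, then any refined positions."""
    for est in estimates:
        yield format_sse("position", est, event_id=est["image"])
    for est in estimates:
        if "refined" in est:
            yield format_sse("refinement", {"image": est["image"], **est["refined"]})
```

A generator like this could be handed to a streaming HTTP response with the `text/event-stream` content type, so that first estimates reach the client before refinements are computed.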
+## Restrictions Assessment
+
+| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
+|-------------|-----------|-------------------|---------------------|--------|
+| No IMU data | No heading, no pitch/roll correction | **CRITICAL restriction.** Most published systems use IMU for heading and as fallback. Without IMU: (1) heading must be derived from consecutive frame matching or satellite matching, (2) no pitch/roll correction, so the pipeline must rely on robust feature matchers, (3) scale from known altitude only. Adds ~30-50% error vs IMU-equipped systems. | High impact: requires visual heading estimation. Most of the reviewed VO literature assumes at least heading from IMU. +2-3 weeks R&D for pure visual heading. | **Realistic but significantly harder.** Consider whether barometer data could be made available. |
+| Camera not auto-stabilized | Images have varying pitch/roll | At 400m with fixed-wing, typical roll ±15°, pitch ±10°. Causes trapezoidal distortion in images. Robust matchers (SuperPoint, LightGlue) handle moderate viewpoint changes. Homography estimation between frames compensates. | Medium impact — modern matchers handle this. Pre-rectification using estimated attitude could help. | **Realistic** — keep as-is. Mitigated by robust matchers. |
+| Google Maps only (cost-dependent) | Currently limited to Google Maps | Google Maps in eastern Ukraine may have 2-5 year old imagery. Conflict damage makes old imagery unreliable. **Risk: satellite-UAV matching may fail in areas with significant ground changes.** Alternatives: Mapbox (Maxar Vivid, sub-meter), Bing Maps (0.3-1m), Maxar SecureWatch (30cm, enterprise pricing). | High risk — may need multiple providers. Google: $200/month free credit. Mapbox: free tier for 100K requests. Maxar: enterprise pricing. | **Tighten** — add fallback provider. Pre-download tile cache for operational area. |
+| Image resolution FullHD to 6252×4168 | Variable resolution across flights | Lower resolution (FullHD=1920×1080) at 400m: GSD ≈ 0.196 m/pixel and footprint ~376m × 212m, assuming the same lens and sensor width. Significantly worse matching but still functional. Need to handle both extremes. | Medium impact: pipeline must be resolution-adaptive. | **Realistic**: keep. But note: FullHD accuracy will be ~3x worse than 26MP. |
+| Altitude ≤ 1km, terrain height negligible | Flat terrain assumption at known altitude | Simplifies scale estimation. At 400m, terrain variations of ±50m cause ±12.5% scale error. Eastern Ukraine is relatively flat (steppe), so this is reasonable. | Low impact for the operational area. | **Realistic** — keep as-is |
+| Mostly sunny weather | Good lighting conditions assumed | Sunny weather = good texture, consistent illumination. Shadows may cause matching issues but are manageable. | Low impact — favorable condition. | **Realistic** — keep. Add: "system performance degrades in overcast/low-light" |
+| Up to 3000 photos per flight | 500-1500 typical, 3000 maximum | At <5s per image: 3000 photos = ~4.2 hours max. Memory: 3000 × 26MP ≈ 78GB decoded at 1 byte/pixel (≈ 234GB as 8-bit RGB), so the images cannot all be held in RAM. Need efficient memory management and incremental processing. | Medium impact: requires streaming architecture and careful memory management. | **Realistic**: keep. Memory management is engineering, not research. |
+| Sharp turns with completely different next photo | Route discontinuity is possible | Most VO systems fail at 0% overlap. This is effectively a new "start point" problem. Satellite matching is the only recovery path. | High impact — already addressed in AC. | **Realistic** — this is the defining challenge |
+| Desktop/laptop with RTX 2060+ | Minimum GPU requirement | RTX 2060: 6GB VRAM, 1920 CUDA cores. Sufficient for SuperPoint, LightGlue, satellite matching. RTX 3070: 8GB VRAM, 5888 CUDA cores — significantly faster. | Low risk — hardware is adequate. | **Realistic** — keep as-is |
+
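The budget figures quoted in the restrictions table (processing time, raw memory, terrain-induced scale error) follow from simple arithmetic. The sketch below reproduces them; the function names are hypothetical, and the bytes-per-pixel values are assumptions about how images would be decoded.

```python
def batch_budget(num_photos, seconds_per_image, width_px, height_px, bytes_per_pixel):
    """Worst-case processing hours and decoded-image gigabytes for one flight."""
    hours = num_photos * seconds_per_image / 3600
    gigabytes = num_photos * width_px * height_px * bytes_per_pixel / 1e9
    return hours, gigabytes


def scale_error_fraction(terrain_delta_m, altitude_m):
    """Relative scale error when terrain height deviates from the flat assumption."""
    return terrain_delta_m / altitude_m


hours, gb_gray = batch_budget(3000, 5, 6252, 4168, 1)   # 1 byte/pixel (grayscale)
_, gb_rgb = batch_budget(3000, 5, 6252, 4168, 3)        # 3 bytes/pixel (8-bit RGB)
print(f"{hours:.1f} h, {gb_gray:.0f} GB grayscale, {gb_rgb:.0f} GB RGB")
print(f"Scale error for 50 m terrain at 400 m: {scale_error_fraction(50, 400):.1%}")
```

Either way the full image set exceeds a 16GB RAM target by a wide margin, which is why frames would need to be loaded, processed, and released one at a time.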
+## Missing Acceptance Criteria (Suggested Additions)
+
+| Criterion | Suggested Value | Rationale |
+|-----------|----------------|-----------|
+| Satellite imagery resolution requirement | 0.5 m/pixel or finer, ideally 0.3 m/pixel | Matching quality depends heavily on reference imagery resolution. With a UAV GSD of 6 cm, satellite imagery must be 0.5 m/pixel or finer for reliable cross-view matching. |
+| Confidence/uncertainty reporting | Report confidence score per position estimate | User needs to know which positions are reliable (satellite-anchored) vs uncertain (VO-only, accumulating drift). |
+| Output format | WGS84 coordinates in GeoJSON or CSV | Standardize output for downstream integration. |
+| Satellite image freshness requirement | < 2 years old for operational area | Older imagery may not match current ground truth due to conflict damage. |
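+| Maximum drift between satellite corrections | < 100m cumulative VO drift before satellite re-anchor | Caps the error of long uncorrected VO segments. Frames that drift more than 50m before re-anchoring miss the 50m/80% target, so re-anchoring should normally happen well before this limit. |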
+| Memory usage limit | < 16GB RAM, < 6GB VRAM | Ensures compatibility with RTX 2060 systems. |
+
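The suggested output format and confidence criteria could be combined in a single record per photo. The sketch below is one possible shape, assuming a two-level confidence label derived from the position source; the property names, the `source` values, and the file names are hypothetical.

```python
import json


def position_feature(image, lat, lon, source):
    """Build a GeoJSON Feature for one photo centre (WGS84)."""
    # Assumption: satellite-anchored positions are "high", everything else "low".
    confidence = "high" if source == "satellite" else "low"
    return {
        "type": "Feature",
        # GeoJSON stores coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"image": image, "source": source, "confidence": confidence},
    }


def to_feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


collection = to_feature_collection([
    position_feature("photo_0001.jpg", 48.27, 37.38, "satellite"),
    position_feature("photo_0002.jpg", 48.2709, 37.38, "vo"),
])
print(json.dumps(collection, indent=2))
```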
+## Key Findings
+
+1. **The 50m/80% accuracy target is achievable** with a well-designed VO + satellite matching pipeline, even without IMU, given the high camera resolution (6cm GSD) and known altitude. NaviLoc achieves 19.5m at lower altitudes; our 400m altitude adds difficulty but 26MP resolution compensates.
+
+2. **The 20m/60% target is aggressive but possible** with high-quality satellite imagery (≤0.5m resolution). Consider starting with a 30m target and tightening through iteration. Performance heavily depends on satellite image quality and freshness for the operational area.
+
+3. **No IMU is the single biggest technical risk.** Most comparable published systems use at least heading from an IMU or magnetometer (Source #1 uses IMU heading; NaviLoc builds on VIO). Visual heading estimation from consecutive frames is feasible but adds noise. This restriction alone could require 2-3 extra weeks of R&D.
+
+4. **Google Maps satellite imagery for eastern Ukraine is a significant risk.** Imagery may be outdated (2-5 years) and may not reflect current ground conditions. A fallback satellite provider is strongly recommended.
+
+5. **Processing speed (<5s) is easily achievable** on RTX 2060+. Modern feature matching pipelines process in <500ms per pair. The pipeline could realistically achieve <2s per image.
+
+6. **Route disconnection handling should be the core architectural principle**, not an edge case. The system should be designed "segments-first" — each segment independently geo-referenced, then stitched.
+
+7. **Missing criterion: confidence reporting.** The user should see which positions are high-confidence (satellite-anchored) vs low-confidence (VO-extrapolated). This is critical for operational use.
+
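The "segments-first" idea from finding 6 can be illustrated with a small data structure. This is a sketch under assumptions, not the planned design: the `Segment` class, its fields, and the `vo_ok` flag list (where `vo_ok[i]` says whether frame `i` matched frame `i-1`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A run of consecutive frames linked by VO."""
    frames: list = field(default_factory=list)
    # frame name -> (lat, lon) obtained from a satellite match or user input
    anchors: dict = field(default_factory=dict)

    @property
    def is_floating(self):
        """True while the segment has no absolute position reference."""
        return not self.anchors


def split_into_segments(frames, vo_ok):
    """Start a new segment whenever VO to the previous frame failed."""
    segments = []
    for i, frame in enumerate(frames):
        if i == 0 or not vo_ok[i]:
            segments.append(Segment())
        segments[-1].frames.append(frame)
    return segments


segments = split_into_segments(["a", "b", "c", "d"], [True, True, False, True])
segments[0].anchors["a"] = (48.27, 37.38)
print([(s.frames, s.is_floating) for s in segments])
```

In this arrangement a sharp turn is not an error path: it simply closes one segment and opens another, and each segment becomes usable as soon as it receives its first anchor.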
+## Sources
+- [Source #1] Mateos-Ramirez et al. (2024) — VO + satellite correction for fixed-wing UAV
+- [Source #2] Öztürk (2025) — ORB-SLAM3 + SIM integration thesis
+- [Source #3] NaviLoc (2025) — Trajectory-level visual localization
+- [Source #4] LightGlue GitHub — Feature matching benchmarks
+- [Source #5] DALGlue (2025) — Enhanced feature matching
+- [Source #8-9] Satellite imagery coverage and pricing reports
@@ -0,0 +1,63 @@
+# Question Decomposition — AC & Restrictions Assessment
+
+## Original Question
+How realistic are the acceptance criteria and restrictions for a GPS-denied visual navigation system for fixed-wing UAV imagery?
+
+## Active Mode
+Mode A, Phase 1: AC & Restrictions Assessment
+
+## Question Type
+Knowledge Organization + Decision Support
+
+## Research Subject Boundary Definition
+
+| Dimension | Boundary |
+|-----------|----------|
+| **Platform** | Fixed-wing UAV, airplane type, not multirotor |
+| **Geography** | Eastern/southern Ukraine, left bank of the Dnipro River (conflict zone, ~48.27°N, 37.38°E based on sample data) |
+| **Altitude** | ≤ 1km, sample data at 400m |
+| **Sensor** | Monocular RGB camera, 26MP, no IMU, no LiDAR |
+| **Processing** | Ground-based desktop/laptop with NVIDIA RTX 2060+ GPU |
+| **Time Window** | Current state-of-the-art (2024-2026) |
+
+## Problem Context Summary
+
+The system must determine GPS coordinates of consecutive aerial photo centers using only:
+- Known starting GPS coordinates
+- Known camera parameters (25mm focal length, 23.5mm sensor width, 6252×4168 resolution)
+- Known flight altitude (≤1km, sample: 400m)
+- Consecutive photos taken within ~100m of each other
+- Satellite imagery (Google Maps) for ground reference
+
+Key constraints: NO IMU data, camera not auto-stabilized, potentially outdated satellite imagery for conflict zone.
+
+**Ground Sample Distance (GSD) at 400m altitude**:
+- GSD = (altitude × sensor width) / (focal length × image width) = (400 m × 23.5 mm) / (25 mm × 6252 px) ≈ 0.060 m/pixel (6 cm/pixel)
+- Ground footprint: ~376m × 250m per image
+- Estimated consecutive overlap: 60-73% (depending on camera orientation relative to flight direction)
+
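The GSD, footprint, and overlap figures above can be checked with a few lines of arithmetic. The sketch below assumes a nadir-pointing camera over flat ground; the function names are hypothetical, and the 100 m spacing is the photo interval stated in the problem context.

```python
def ground_sample_distance(altitude_m, sensor_width_mm, focal_mm, image_width_px):
    """Ground size of one pixel in metres for a nadir-pointing camera."""
    return (altitude_m * sensor_width_mm) / (focal_mm * image_width_px)


def footprint_m(gsd_m, width_px, height_px):
    """Ground footprint (width, height) in metres."""
    return gsd_m * width_px, gsd_m * height_px


def overlap_fraction(footprint_along_track_m, spacing_m):
    """Overlap between consecutive frames along the flight direction."""
    return max(0.0, 1.0 - spacing_m / footprint_along_track_m)


gsd = ground_sample_distance(400, 23.5, 25, 6252)
fp_w, fp_h = footprint_m(gsd, 6252, 4168)
# Overlap depends on whether the short or long image side lies along track.
low, high = overlap_fraction(fp_h, 100), overlap_fraction(fp_w, 100)
print(f"GSD: {gsd * 100:.2f} cm/pixel")
print(f"Footprint: {fp_w:.1f} m x {fp_h:.1f} m")
print(f"Overlap at 100 m spacing: {low:.0%} to {high:.0%}")
```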
+## Sub-Questions for AC Assessment
+
+1. What GPS accuracy is achievable with VO + satellite matching at 400m altitude with 26MP camera?
+2. How does the absence of IMU affect accuracy and what compensations exist?
+3. What processing speed is achievable per image on RTX 2060+ for the required pipeline?
+4. What image registration rates are achievable with deep learning matchers?
+5. What reprojection errors are typical for modern feature matching?
+6. How do sharp turns and route disconnections affect VO systems?
+7. What satellite imagery quality is available for the operational area?
+8. What domain-specific acceptance criteria might be missing?
+
+## Timeliness Sensitivity Assessment
+
+- **Research Topic**: GPS-denied visual navigation using deep learning feature matching
+- **Sensitivity Level**: 🟠 High
+- **Rationale**: Deep learning feature matchers (SuperPoint, LightGlue, GIM) are evolving rapidly; new methods appear quarterly. Satellite imagery providers update pricing and coverage frequently.
+- **Source Time Window**: 12 months (2024-2026)
+- **Priority official sources to consult**:
+  1. LightGlue GitHub repository (cvg/LightGlue)
+  2. ORB-SLAM3 documentation
+  3. Recent MDPI/IEEE papers on GPS-denied UAV navigation
+- **Key version information to verify**:
+  - LightGlue: Current release and performance benchmarks
+  - SuperPoint: Compatibility and inference speed
+  - ORB-SLAM3: Monocular mode capabilities
@@ -0,0 +1,133 @@
+# Source Registry
+
+## Source #1
+- **Title**: Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV with Reduced Accumulative Error Based on Satellite Imagery
+- **Link**: https://www.mdpi.com/2076-3417/14/16/7420
+- **Tier**: L1
+- **Publication Date**: 2024-08-22
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: Fixed-wing UAV navigation researchers
+- **Research Boundary Match**: ✅ Full match (fixed-wing, high altitude, satellite matching)
+- **Summary**: VO + satellite image correction achieves 142.88m mean error over 17km at >1000m altitude using ORB + AKAZE. Uses IMU for heading and barometer for altitude. Error rate 0.83% of total distance.
+- **Related Sub-question**: 1, 2
+
+## Source #2
+- **Title**: Optimized visual odometry and satellite image matching-based localization for UAVs in GPS-denied environments (ITU Thesis)
+- **Link**: https://polen.itu.edu.tr/items/1fe1e872-7cea-44d8-a8de-339e4587bee6
+- **Tier**: L1
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: UAV navigation researchers
+- **Research Boundary Match**: ⚠️ Partial overlap (multirotor at 30-100m, but same VO+SIM methodology)
+- **Summary**: ORB-SLAM3 + SuperPoint/SuperGlue/GIM achieves GPS-level accuracy. VO module: ±2m local accuracy. SIM module: 93% matching success rate. Demonstrated on DJI Mavic Air 2 at 30-100m.
+- **Related Sub-question**: 1, 2, 4
+
+## Source #3
+- **Title**: NaviLoc: Trajectory-Level Visual Localization for GNSS-Denied UAV Navigation
+- **Link**: https://www.mdpi.com/2504-446X/10/2/97
+- **Tier**: L1
+- **Publication Date**: 2025-12
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: UAV navigation / VPR researchers
+- **Research Boundary Match**: ⚠️ Partial overlap (50-150m altitude, uses VIO not pure VO)
+- **Summary**: Achieves 19.5m Mean Localization Error at 50-150m altitude. Runs at 9 FPS on Raspberry Pi 5. 16x improvement over AnyLoc-VLAD, 32x over raw VIO drift. Training-free system.
+- **Related Sub-question**: 1, 7
+
+## Source #4
+- **Title**: LightGlue: Local Feature Matching at Light Speed (GitHub + ICCV 2023)
+- **Link**: https://github.com/cvg/LightGlue
+- **Tier**: L1
+- **Publication Date**: 2023 (actively maintained through 2025)
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: Computer vision practitioners
+- **Research Boundary Match**: ✅ Full match (core component)
+- **Summary**: ~20-34ms per image pair on RTX 2080Ti. Adaptive pruning for fast inference. 2-4x speedup with PyTorch compilation.
+- **Related Sub-question**: 3, 4
+
+## Source #5
+- **Title**: Efficient image matching for UAV visual navigation via DALGlue
+- **Link**: https://www.nature.com/articles/s41598-025-21602-5
+- **Tier**: L1
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: UAV navigation researchers
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: DALGlue achieves 11.8% improvement over LightGlue on matching accuracy. Uses dual-tree complex wavelet preprocessing + linear attention for real-time performance.
+- **Related Sub-question**: 3, 4
+
+## Source #6
+- **Title**: Deep-UAV SLAM: SuperPoint and SuperGlue enhanced SLAM
+- **Link**: https://isprs-archives.copernicus.org/articles/XLVIII-1-W5-2025/177/2025/
+- **Tier**: L1
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: UAV SLAM researchers
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: Replacing ORB-SLAM3's ORB features with SuperPoint+SuperGlue improved robustness and accuracy in aerial RGB scenarios.
+- **Related Sub-question**: 4, 5
+
+## Source #7
+- **Title**: SCAR: Satellite Imagery-Based Calibration for Aerial Recordings
+- **Link**: https://arxiv.org/html/2602.16349v1
+- **Tier**: L1
+- **Publication Date**: 2026-02
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: Aerial/satellite vision researchers
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: Long-term auto-calibration refinement by aligning aerial images with 2D-3D correspondences from orthophotos and elevation models.
+- **Related Sub-question**: 1, 5
+
+## Source #8
+- **Title**: Google Maps satellite imagery coverage and update frequency
+- **Link**: https://ongeo-intelligence.com/blog/how-often-does-google-maps-update-satellite-images
+- **Tier**: L3
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: GIS practitioners
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: Conflict zones like eastern Ukraine face 2-5+ year update cycles. Imagery may be intentionally limited or blurred.
+- **Related Sub-question**: 7
+
+## Source #9
+- **Title**: Satellite Mapping Services comparison 2025
+- **Link**: https://ts2.tech/en/exploring-the-world-from-above-top-satellite-mapping-services-for-web-mobile-in-2025/
+- **Tier**: L3
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: Developers, GIS practitioners
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: Google: $200/month free credit, sub-meter resolution. Mapbox: Maxar imagery, generous free tier. Maxar SecureWatch: 30cm resolution, enterprise pricing. Planet: daily 3-4m imagery.
+- **Related Sub-question**: 7
+
+## Source #10
+- **Title**: Scale Estimation for Monocular Visual Odometry Using Reliable Camera Height
+- **Link**: https://ieeexplore.ieee.org/document/9945178/
+- **Tier**: L1
+- **Publication Date**: 2022
+- **Timeliness Status**: ✅ Currently valid (fundamental method)
+- **Target Audience**: VO researchers
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: Known camera height/altitude resolves scale ambiguity in monocular VO. Essential for systems without IMU.
+- **Related Sub-question**: 2
+
+## Source #11
+- **Title**: Cross-View Geo-Localization benchmarks (SSPT, MA metrics)
+- **Link**: https://www.mdpi.com/1424-8220/24/12/3719
+- **Tier**: L1
+- **Publication Date**: 2024
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: VPR/geo-localization researchers
+- **Research Boundary Match**: ⚠️ Partial overlap (general cross-view, not UAV-specific)
+- **Summary**: SSPT achieved 84.40% RDS on UL14 dataset. MA improvements: +12% at 3m, +12% at 5m, +10% at 20m thresholds.
+- **Related Sub-question**: 1
+
+## Source #12
+- **Title**: ORB-SLAM3 GPU Acceleration Performance
+- **Link**: https://arxiv.org/html/2509.10757v1
+- **Tier**: L1
+- **Publication Date**: 2025
+- **Timeliness Status**: ✅ Currently valid
+- **Target Audience**: SLAM/VO engineers
+- **Research Boundary Match**: ✅ Full match
+- **Summary**: GPU acceleration achieves 2.8x speedup on desktop systems. 30 FPS achievable on Jetson TX2. Feature extraction up to 3x speedup with CUDA.
+- **Related Sub-question**: 3
@@ -0,0 +1,121 @@
+# Fact Cards
+
+## Fact #1
+- **Statement**: VO + satellite image correction achieves ~142.88m mean error over 17km flight at >1000m altitude using ORB features and AKAZE satellite matching. Error rate: 0.83% of total distance. This system uses IMU for heading and barometer for altitude.
+- **Source**: Source #1 — https://www.mdpi.com/2076-3417/14/16/7420
+- **Phase**: Phase 1
+- **Target Audience**: Fixed-wing UAV at high altitude (>1000m)
+- **Confidence**: ✅ High (peer-reviewed, real-world flight data)
+- **Related Dimension**: GPS accuracy, drift correction
+
+## Fact #2
+- **Statement**: ORB-SLAM3 monocular mode with optimized parameters achieves ±2m local accuracy for visual odometry. Scale ambiguity and drift remain for long flights.
+- **Source**: Source #2 — ITU Thesis
+- **Phase**: Phase 1
+- **Target Audience**: UAV navigation (30-100m altitude, multirotor)
+- **Confidence**: ✅ High (thesis with experimental validation)
+- **Related Dimension**: VO accuracy, scale ambiguity
+
+## Fact #3
+- **Statement**: Combined VO + Satellite Image Matching (SIM) with SuperPoint/SuperGlue/GIM achieves 93% matching success rate and "GPS-level accuracy" at 30-100m altitude.
+- **Source**: Source #2 — ITU Thesis
+- **Phase**: Phase 1
+- **Target Audience**: Low-altitude UAV (30-100m)
+- **Confidence**: ✅ High
+- **Related Dimension**: Registration rate, satellite matching
+
+## Fact #4
+- **Statement**: NaviLoc achieves 19.5m Mean Localization Error at 50-150m altitude, runs at 9 FPS on Raspberry Pi 5. 16x improvement over AnyLoc-VLAD. Training-free system.
+- **Source**: Source #3 — NaviLoc paper
+- **Phase**: Phase 1
+- **Target Audience**: Low-altitude UAV (50-150m) in rural areas
+- **Confidence**: ✅ High (peer-reviewed)
+- **Related Dimension**: GPS accuracy, processing speed
+
+## Fact #5
+- **Statement**: LightGlue inference: ~20-34ms per image pair on RTX 2080Ti for 1024 keypoints. 2-4x speedup possible with PyTorch compilation and TensorRT.
+- **Source**: Source #4 — LightGlue GitHub Issues
+- **Phase**: Phase 1
+- **Target Audience**: All GPU-accelerated vision systems
+- **Confidence**: ✅ High (official repository benchmarks)
+- **Related Dimension**: Processing speed
+
+## Fact #6
+- **Statement**: SuperPoint+SuperGlue replacing ORB features in SLAM improves robustness and accuracy for aerial RGB imagery over classical handcrafted features.
+- **Source**: Source #6 — ISPRS 2025
+- **Phase**: Phase 1
+- **Target Audience**: UAV SLAM researchers
+- **Confidence**: ✅ High (peer-reviewed)
+- **Related Dimension**: Feature matching quality
+
+## Fact #7
+- **Statement**: Eastern Ukraine / conflict zones may have 2-5+ year old satellite imagery on Google Maps. Imagery may be intentionally limited, blurred, or restricted for security reasons.
+- **Source**: Source #8
+- **Phase**: Phase 1
+- **Target Audience**: Ukraine conflict zone operations
+- **Confidence**: ⚠️ Medium (general reporting, not Ukraine-specific verification)
+- **Related Dimension**: Satellite imagery quality
+
+## Fact #8
+- **Statement**: Maxar SecureWatch offers 30cm resolution with ~3M km² new imagery daily. Mapbox uses Maxar's Vivid imagery with sub-meter resolution. Google Maps offers sub-meter detail in urban areas but 1-3m in rural areas.
+- **Source**: Source #9
+- **Phase**: Phase 1
+- **Target Audience**: All satellite imagery users
+- **Confidence**: ✅ High
+- **Related Dimension**: Satellite providers, cost
+
+## Fact #9
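+- **Statement**: Known camera height/altitude resolves scale ambiguity in monocular VO. The pixel-to-meter conversion is s = (H / f) × sensor_pixel_size, where H is the height above ground, f is the focal length, and sensor_pixel_size is the physical size of one sensor pixel. This enables metric reconstruction without IMU.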
+- **Source**: Source #10
+- **Phase**: Phase 1
+- **Target Audience**: Monocular VO systems
+- **Confidence**: ✅ High (fundamental geometric relationship)
+- **Related Dimension**: No-IMU compensation
+
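As a worked illustration of Fact #9, the sketch below converts a pixel shift between two frames into metres and then into a WGS84 offset. It is a simplification under stated assumptions: the image is taken to be north-aligned, the earth is treated as a sphere with the WGS84 semi-major axis as radius, and the 1663-pixel shift is an invented example value. Function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used as a spherical radius


def metres_per_pixel(altitude_m, focal_mm, pixel_size_mm):
    """s = (H / f) * sensor pixel size."""
    return altitude_m / focal_mm * pixel_size_mm


def offset_position(lat_deg, lon_deg, east_m, north_m):
    """Shift a WGS84 position by a small metric offset (flat-earth approximation)."""
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + dlat, lon_deg + dlon


scale = metres_per_pixel(400, 25, 23.5 / 6252)   # pixel size = sensor width / image width
north_m = 1663 * scale                           # hypothetical shift of 1663 pixels
lat2, lon2 = offset_position(48.27, 37.38, 0.0, north_m)
print(f"{north_m:.1f} m north -> ({lat2:.6f}, {lon2:.6f})")
```

The approximation is adequate for frame-to-frame steps of around 100 m; a production pipeline would more likely use a proper projection library.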
+## Fact #10
+- **Statement**: Camera heading (yaw) can be estimated from consecutive frame feature matching by decomposing the homography or essential matrix. Pitch/roll can be estimated from horizon detection or vanishing points. Without IMU, these estimates are noisier but functional.
+- **Source**: Multiple vision-based heading estimation papers
+- **Phase**: Phase 1
+- **Target Audience**: Vision-only navigation systems
+- **Confidence**: ⚠️ Medium (well-established but accuracy varies)
+- **Related Dimension**: No-IMU compensation
+
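To make Fact #10 concrete, the sketch below estimates the in-plane rotation between two frames from matched keypoints. It swaps in a simpler technique than homography or essential-matrix decomposition: a least-squares 2D similarity fit, which is only a reasonable stand-in for a near-nadir camera over flat ground. It has no outlier rejection, so real matches would first need RANSAC or similar filtering. The function name and the synthetic points are assumptions for illustration.

```python
import cmath
import math


def fit_similarity(src_pts, dst_pts):
    """Least-squares 2D similarity (rotation, scale, translation) mapping src to dst."""
    ps = [complex(x, y) for x, y in src_pts]
    qs = [complex(x, y) for x, y in dst_pts]
    p_mean = sum(ps) / len(ps)
    q_mean = sum(qs) / len(qs)
    # Treat points as complex numbers: q = a * p + b with a = scale * exp(i * angle).
    num = sum((q - q_mean) * (p - p_mean).conjugate() for p, q in zip(ps, qs))
    den = sum(abs(p - p_mean) ** 2 for p in ps)
    a = num / den
    b = q_mean - a * p_mean
    return math.degrees(cmath.phase(a)), abs(a), (b.real, b.imag)


# Synthetic check: rotate four points by 30 degrees and shift them by (5, -3).
theta = math.radians(30.0)
src = [(0, 0), (100, 0), (0, 100), (100, 100)]
dst = [(x * math.cos(theta) - y * math.sin(theta) + 5.0,
        x * math.sin(theta) + y * math.cos(theta) - 3.0) for x, y in src]
yaw_deg, scale, shift = fit_similarity(src, dst)
print(f"rotation {yaw_deg:.1f} deg, scale {scale:.3f}, shift {shift}")
```

Note that in image coordinates the y axis points down, so the sign of the recovered angle has to be mapped to a compass heading convention before use.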
+## Fact #11
+- **Statement**: GSD at 400m with a 25mm focal length, 23.5mm sensor width, and 6252px image width = 6.01 cm/pixel. Ground footprint: 376m × 250m. At 100m photo interval, consecutive overlap is 60-73%.
+- **Source**: Calculated from problem data using standard GSD formula
+- **Phase**: Phase 1
+- **Target Audience**: This specific system
+- **Confidence**: ✅ High (deterministic calculation)
+- **Related Dimension**: Image coverage, overlap
+
+## Fact #12
+- **Statement**: GPU-accelerated ORB-SLAM3 achieves 2.8x speedup on desktop systems. 30 FPS possible on Jetson TX2. Feature extraction speedup up to 3x with CUDA-optimized pipelines.
+- **Source**: Source #12
+- **Phase**: Phase 1
+- **Target Audience**: GPU-equipped systems
+- **Confidence**: ✅ High
+- **Related Dimension**: Processing speed
+
+## Fact #13
+- **Statement**: Without IMU, the Mateos-Ramirez paper (Source #1) would lose: (a) yaw angle for rotation compensation, (b) fallback when feature matching fails. Their 142.88m error would likely be significantly higher without IMU heading data.
+- **Source**: Inference from Source #1 methodology
+- **Phase**: Phase 1
+- **Target Audience**: This specific system
+- **Confidence**: ⚠️ Medium (reasoned inference)
+- **Related Dimension**: No-IMU impact
+
+## Fact #14
+- **Statement**: DALGlue achieves 11.8% improvement over LightGlue on matching accuracy while maintaining real-time performance through dual-tree complex wavelet preprocessing and linear attention.
+- **Source**: Source #5
+- **Phase**: Phase 1
+- **Target Audience**: Feature matching systems
+- **Confidence**: ✅ High (peer-reviewed, 2025)
+- **Related Dimension**: Feature matching quality
+
+## Fact #15
+- **Statement**: Cross-view geo-localization benchmarks show MA@20m improving by +10% with latest methods (SSPT). RDS metric at 84.40% indicates reliable spatial positioning.
+- **Source**: Source #11
+- **Phase**: Phase 1
+- **Target Audience**: Cross-view matching researchers
+- **Confidence**: ✅ High
+- **Related Dimension**: Cross-view matching accuracy
@@ -0,0 +1,115 @@
+# Comparison Framework
+
+## Selected Framework Type
+Decision Support (component-by-component solution comparison)
+
+## System Components
+1. Visual Odometry (consecutive frame matching)
+2. Satellite Image Geo-Referencing (cross-view matching)
+3. Heading & Orientation Estimation (without IMU)
+4. Drift Correction & Position Fusion
+5. Segment Management & Route Reconnection
+6. Interactive Point-to-GPS Lookup
+7. Pipeline Orchestration & API
+
+---
+
+## Component 1: Visual Odometry
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| ORB-SLAM3 monocular | ORB features, BA, map management | Mature, well-tested, handles loop closure. GPU-accelerated. 30FPS on Jetson TX2. | Scale ambiguity without IMU. Over-engineered for sequential aerial — map building not needed. Heavy dependency. | Medium — too complex for the use case |
+| Homography-based VO with SuperPoint+LightGlue | SuperPoint, LightGlue, OpenCV homography | Ground plane assumption perfect for flat terrain at 400m. Cleanly separates rotation/translation. Known altitude resolves scale directly. Fast. | Assumes planar scene (valid for our case). Fails at sharp turns (but that's expected). | **Best fit** — matches constraints exactly |
+| Optical flow VO | cv2.calcOpticalFlowPyrLK or RAFT | Tracks pixel motion directly (sparse with pyramidal Lucas-Kanade, dense with RAFT); no descriptor matching needed. | Less accurate for large motions. Struggles with texture-sparse areas. No inherent rotation estimation. | Low: not suitable for 100m baselines |
+| Direct method (SVO) | SVO Pro | Sub-pixel precision, fast. | Designed for high-frame-rate video with small inter-frame baselines. Poor fit for still photos taken ~100m apart. | Low |
+
+**Selected**: Homography-based VO with SuperPoint + LightGlue features
+
+---
+
+## Component 2: Satellite Image Geo-Referencing
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| SuperPoint + LightGlue cross-view matching | SuperPoint, LightGlue, perspective warp | Best overall performance on satellite stereo benchmarks. Fast (~50ms matching). Handles viewpoint/scale changes and moderate rotation. | Requires perspective warping to reduce viewpoint gap. Large rotations may need pre-rotation by the estimated heading. Needs good satellite image quality. | **Best fit**: proven on satellite imagery |
+| SuperPoint + SuperGlue + GIM | SuperPoint, SuperGlue, GIM | GIM adds generalization for challenging scenes. 93% match rate (ITU thesis). | SuperGlue slower than LightGlue. GIM adds complexity. | Good — slightly better robustness, slower |
+| LoFTR (detector-free) | LoFTR | No keypoint detection step. Works on low-texture. | Slower than detector-based methods. Fixed resolution (coarse). Less accurate than SuperPoint+LightGlue on satellite benchmarks. | Medium — fallback option |
+| DUSt3R/MASt3R | DUSt3R/MASt3R | Handles extreme viewpoints and low overlap. +50% completeness over COLMAP in sparse scenarios. | Very slow. Designed for 3D reconstruction not 2D matching. Unreliable with many images. | Low — only for extreme fallback |
+| Terrain-weighted optimization (YFS90) | Custom pipeline + DEM | Reported <7m MAE without IMU. Drift-free. Handles thermal IR. Validated on 20 scenarios. | Requires DEM data. More complex implementation. Matching details are not open-source. | High: architecture inspiration |
+
+**Selected**: SuperPoint + LightGlue (primary) with perspective warping. GIM as supplementary for difficult matches. YFS90-style terrain-weighted sliding window for position optimization.
+
+---
+
+## Component 3: Heading & Orientation Estimation
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| Homography decomposition (consecutive frames) | OpenCV decomposeHomographyMat | Directly gives rotation between frames. Works with ground plane assumption. No extra sensors needed. | Accumulates heading drift over time. Noisy for small motions. Ambiguous decomposition (need to select correct solution). | **Best fit** — primary heading source |
+| Satellite matching absolute orientation | From satellite match homography | Provides absolute heading correction. Eliminates accumulated heading drift. | Only available when satellite match succeeds. Intermittent. | **Best fit** — drift correction for heading |
+| Optical flow direction | Dense flow vectors | Simple to compute. | Very noisy at high altitude. Unreliable for heading. | Low |
+
+**Selected**: Homography decomposition for frame-to-frame heading + satellite matching for periodic absolute heading correction.
+
+---
+
+## Component 4: Drift Correction & Position Fusion
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| Kalman filter (EKF/UKF) | filterpy or custom | Well-understood. Handles noisy measurements. Good for fusing VO + satellite. | Assumes Gaussian noise. Linearization issues with EKF. | Good — simple and effective |
+| Sliding window optimization with terrain constraints | Custom optimization, scipy.optimize | YFS90 achieves <7m with this. Directly constrains drift. No loop closure needed. | More complex to implement. Needs tuning. | **Best fit** — proven for this exact problem |
+| Pose graph optimization | g2o, GTSAM | Standard in SLAM. Handles satellite anchors as prior factors. Globally optimal. | Heavy dependency. Over-engineered if segments are short. | Medium — overkill unless routes are very long |
+| Simple anchor reset | Direct correction at satellite match | Simplest. Just replace VO position with satellite position. | Discontinuous trajectory. No smoothing. | Low — too crude |
+
+**Selected**: Sliding window optimization with terrain constraints (inspired by YFS90), with Kalman filter as simpler fallback. Satellite matches as absolute anchor constraints.
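+
+A minimal sketch of the Kalman fallback named above, assuming an independent scalar filter per axis in a local metric frame. The class name `AxisKalman` and the variance values in the usage note are illustrative assumptions, not part of the selected design; the sliding window optimizer itself is not shown.
+
```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Scalar Kalman filter for one position axis (hypothetical helper)."""
    x: float  # position estimate in metres
    p: float  # variance of the estimate in square metres

    def predict(self, vo_delta_m: float, vo_var: float) -> None:
        # VO gives a relative displacement, so uncertainty grows.
        self.x += vo_delta_m
        self.p += vo_var

    def update(self, sat_pos_m: float, sat_var: float) -> None:
        # A satellite match is an absolute measurement of position.
        gain = self.p / (self.p + sat_var)
        self.x += gain * (sat_pos_m - self.x)
        self.p *= (1.0 - gain)


def fuse_step(east: AxisKalman, north: AxisKalman,
              vo_delta, vo_var, sat_fix=None, sat_var=None):
    """Advance both axes by one frame; apply the satellite fix if present."""
    east.predict(vo_delta[0], vo_var)
    north.predict(vo_delta[1], vo_var)
    if sat_fix is not None:
        east.update(sat_fix[0], sat_var)
        north.update(sat_fix[1], sat_var)
    return east.x, north.x
```
+
+Calling `fuse_step` without `sat_fix` models a frame where satellite matching failed: the estimate follows VO and its variance keeps growing until the next anchor arrives.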
+
+---
+
+## Component 5: Segment Management & Route Reconnection
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| Segments-first architecture with satellite anchoring | Custom segment manager | Each segment independently geo-referenced. No dependency between disconnected segments. Natural handling of sharp turns. | Needs robust satellite matching per segment. Segments without any satellite match are "floating". | **Best fit** — matches AC requirement for core strategy |
+| Global pose graph with loop closure | g2o/GTSAM | Can connect segments when they revisit same area. | Heavy. Doesn't help if segments don't overlap with each other. | Low — segments may not revisit same areas |
+| Trajectory-level VPR (NaviLoc-style) | VPR + trajectory optimization | Global optimization across trajectory. | Requires pre-computed VPR database. Complex. Designed for continuous trajectory, not disconnected segments. | Low |
+
+**Selected**: Segments-first architecture. Each segment starts from a satellite anchor or user input. Segments connected through shared satellite coordinate frame.
+
+---
+
+## Component 6: Interactive Point-to-GPS Lookup
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| Homography projection (image → ground) | Computed homography from satellite match | Already computed during geo-referencing. Accurate for flat terrain. | Only works for images with successful satellite match. | **Best fit** |
+| Camera ray-casting with known altitude | Camera intrinsics + pose estimate | Works for any image with pose estimate. Simpler math. | Accuracy depends on pose estimate quality. | Good — fallback for non-satellite-matched images |
+
+**Selected**: Homography projection (primary) + ray-casting (fallback).
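+
+The two lookup paths can be sketched as follows, under stated assumptions: the homography is a 3x3 nested list mapping image pixels to a local east/north frame in metres, the fallback treats the camera as nadir-pointing over flat ground, and heading is measured clockwise from north for the image "up" direction. All function names are hypothetical.
+
```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius


def apply_homography(H, u, v):
    """Map pixel (u, v) through a 3x3 homography given as nested lists."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to infinity")
    return x / w, y / w


def nadir_pixel_to_offset(u, v, cx, cy, gsd_m, heading_deg):
    """Fallback: ground offset (east, north) in metres of a pixel relative
    to the image centre, assuming a nadir camera and flat terrain."""
    right_m = (u - cx) * gsd_m
    up_m = (cy - v) * gsd_m  # image y grows downwards
    th = math.radians(heading_deg)
    east = right_m * math.cos(th) + up_m * math.sin(th)
    north = -right_m * math.sin(th) + up_m * math.cos(th)
    return east, north


def offset_to_latlon(lat0, lon0, east_m, north_m):
    """Small-offset approximation, adequate for a few hundred metres."""
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat0))))
    return lat0 + dlat, lon0 + dlon
```
+
+For a satellite-matched image, the clicked pixel would go through `apply_homography` and then `offset_to_latlon`; for other images, `nadir_pixel_to_offset` with the estimated pose takes the place of the homography.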
+
+---
+
+## Component 7: Pipeline & API
+
+| Solution | Tools | Advantages | Limitations | Fit |
+|----------|-------|-----------|-------------|-----|
+| Python FastAPI + SSE | FastAPI, EventSourceResponse, asyncio | Native SSE support (since 0.135.0). Async GPU pipeline. Excellent for ML/CV workloads. Rich ecosystem. | Python GIL (mitigated with multiprocessing for CPU-bound stages; async only helps I/O-bound work). | **Best fit** — natural for CV/ML pipeline |
+| .NET ASP.NET Core + SSE | ASP.NET Core, SignalR | High performance. Good for enterprise. | Less natural for CV/ML. Python interop needed for PyTorch models. Adds complexity. | Low — unnecessary indirection |
+| Python + gRPC streaming | gRPC | Efficient binary protocol. Bidirectional streaming. | More complex client integration. No browser-native support. | Medium — overkill for this use case |
+
+**Selected**: Python FastAPI with SSE.
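+
+To illustrate what the SSE stream carries, here is a framework-independent sketch of the wire format using only the standard library. The event names (`position`, `done`) and payload fields are assumptions for illustration; in the real service an async generator like this would be handed to the framework's SSE response rather than collected in memory.
+
```python
import asyncio
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event: an event name, a JSON data line,
    and the blank line that terminates the event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def stream_results(results, delay_s: float = 0.0):
    """Yield one SSE chunk per processed image, then a final 'done' event.
    `results` stands in for the output of the per-image pipeline."""
    for result in results:
        await asyncio.sleep(delay_s)  # placeholder for real processing time
        yield format_sse("position", result)
    yield format_sse("done", {"count": len(results)})


async def collect(agen):
    """Helper for local testing: drain an async generator into a list."""
    return [chunk async for chunk in agen]
```
+
+Each chunk is a complete event, so a browser `EventSource` client can update the map as soon as a single image has been positioned.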
+
+---
+
+## Google Maps Tile Resolution at Latitude 48° (Operational Area)
+
+| Zoom Level | Meters/pixel | Tile coverage (256px) | Tiles for 20km² | Download size est. |
+|-----------|-------------|----------------------|-----------------|-------------------|
+| 17 | 0.80 m/px | ~205m × 205m | ~500 tiles | ~20MB |
+| 18 | 0.40 m/px | ~102m × 102m | ~2,000 tiles | ~80MB |
+| 19 | 0.20 m/px | ~51m × 51m | ~8,000 tiles | ~320MB |
+| 20 | 0.10 m/px | ~26m × 26m | ~30,000 tiles | ~1.2GB |
+
+Formula: metersPerPx = 156543.03 × cos(48° × π/180) / 2^zoom ≈ 104,748 / 2^zoom
+
+**Selected**: Zoom 18 (0.40 m/px) as primary matching resolution. Zoom 19 (0.20 m/px) for refinement if available. Zoom 18 satisfies the AC resolution requirement (0.5 m/pixel or finer).
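+
+The table values can be reproduced with a short sketch of the Web Mercator resolution formula. The function names are hypothetical, and the tile count is a lower bound because it ignores partially covered tiles at the edge of the area.
+
```python
import math

# Web Mercator ground resolution at zoom 0 on the equator, 256 px tiles.
EQUATOR_M_PER_PX_Z0 = 156543.03


def meters_per_pixel(lat_deg: float, zoom: int) -> float:
    """Ground resolution of a Web Mercator tile pixel at a given latitude."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def tile_side_m(lat_deg: float, zoom: int, tile_px: int = 256) -> float:
    """Ground length of one tile edge in metres."""
    return meters_per_pixel(lat_deg, zoom) * tile_px


def tiles_for_area(area_km2: float, lat_deg: float, zoom: int,
                   tile_px: int = 256) -> int:
    """Lower bound on the number of tiles needed to cover an area."""
    tile_area_km2 = (tile_side_m(lat_deg, zoom, tile_px) / 1000.0) ** 2
    return math.ceil(area_km2 / tile_area_km2)


if __name__ == "__main__":
    for zoom in (17, 18, 19, 20):
        print(zoom,
              round(meters_per_pixel(48.0, zoom), 3),
              round(tile_side_m(48.0, zoom)),
              tiles_for_area(20.0, 48.0, zoom))
```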
@@ -0,0 +1,146 @@
+# Reasoning Chain
+
+## Dimension 1: GPS Accuracy (50m/80%, 20m/60%)
+
+### Fact Confirmation
+- YFS90 system achieves <7m MAE without IMU (Fact from Source DOAJ/GitHub)
+- NaviLoc achieves 19.5m MLE at 50-150m altitude (Fact #4)
+- Mateos-Ramirez achieves 143m mean error at >1000m altitude with IMU (Fact #1)
+- Our GSD is 6cm/pixel at 400m altitude (Fact #11)
+- ITU thesis achieves GPS-level accuracy with VO+SIM at 30-100m (Fact #3)
+
+### Reference Comparison
+- At 400m altitude, our 26MP camera yields a 6cm/pixel GSD, which is finer than the imagery used by most of the reference systems
+- YFS90 at <7m without IMU is the strongest reference — uses terrain-weighted constraint optimization
+- NaviLoc at 19.5m uses trajectory-level optimization but at lower altitude
+- The combination of VO + satellite matching with sliding window optimization should achieve 10-30m depending on satellite image quality
+
+### Conclusion
+- **50m / 80%**: High confidence achievable. Multiple systems achieve better than this.
+- **20m / 60%**: Achievable with good satellite imagery. YFS90 achieves <7m. Our higher altitude makes cross-view matching harder, but the 26MP camera's fine GSD partly compensates.
+- **10m stretch**: Possible with zoom 19 satellite tiles (0.2m/px) and terrain-weighted optimization.
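+
+A small sketch of how these targets could be checked against ground truth, assuming each photo has an estimated and a true (lat, lon). The helper names and the spherical Earth radius are assumptions for illustration.
+
```python
import math

EARTH_RADIUS_M = 6371000.0  # mean spherical radius, adequate for short distances


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def fraction_within(errors_m, threshold_m):
    """Share of photos whose position error is at most threshold_m."""
    if not errors_m:
        raise ValueError("no errors to evaluate")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy_criteria(errors_m):
    """Evaluate the two accuracy criteria on a list of per-photo errors."""
    return {
        "80pct_within_50m": fraction_within(errors_m, 50.0) >= 0.80,
        "60pct_within_20m": fraction_within(errors_m, 20.0) >= 0.60,
    }
```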
+
+### Confidence: ✅ High for 50m, ⚠️ Medium for 20m, ❓ Low for 10m
+
+---
+
+## Dimension 2: No-IMU Heading Estimation
+
+### Fact Confirmation
+- Homography decomposition gives rotation between frames for planar scenes (multiple sources)
+- Ground plane assumption is valid for flat terrain (eastern Ukraine steppe)
+- Satellite matching provides absolute orientation correction (Sources #1, #2)
+- YFS90 achieves <7m without requiring IMU (Source #3 DOAJ)
+
+### Reference Comparison
+- Most published systems use IMU for heading — our approach is less common
+- YFS90 proves it's possible without IMU, but uses DEM data for terrain weighting
+- The key insight: satellite matching provides both position AND heading correction, making intermittent heading drift from VO acceptable
+
+### Conclusion
+Heading estimation from homography decomposition between consecutive frames + periodic satellite matching correction is viable. The frame-to-frame heading drift accumulates, but satellite corrections at regular intervals (every 5-20 frames) reset it. The flat terrain of the operational area makes the ground plane assumption reliable.
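+
+The accumulate-then-reset behaviour can be sketched as below. This is a simplified stand-in, not the full homography decomposition: it assumes a nadir camera over flat ground so that the frame-to-frame homography is close to a 2D similarity, and reads the in-plane rotation from its upper-left 2x2 block. The sign convention and the class name `HeadingTracker` are assumptions.
+
```python
import math


def rotation_from_homography(H):
    """Approximate in-plane rotation in degrees from a 3x3 homography
    (nested lists), valid when H is close to a similarity transform."""
    return math.degrees(math.atan2(H[1][0], H[0][0]))


def wrap_deg(angle_deg):
    """Wrap an angle to the range [0, 360)."""
    return angle_deg % 360.0


class HeadingTracker:
    """Integrates VO heading deltas and accepts absolute corrections."""

    def __init__(self, initial_heading_deg):
        self.heading_deg = wrap_deg(initial_heading_deg)
        self.frames_since_fix = 0

    def apply_vo_delta(self, delta_deg):
        # Relative update: drift accumulates with every frame.
        self.heading_deg = wrap_deg(self.heading_deg + delta_deg)
        self.frames_since_fix += 1
        return self.heading_deg

    def apply_satellite_fix(self, absolute_heading_deg):
        # Absolute update: discards the accumulated drift.
        self.heading_deg = wrap_deg(absolute_heading_deg)
        self.frames_since_fix = 0
        return self.heading_deg
```
+
+`frames_since_fix` gives a cheap proxy for heading confidence: the longer the tracker runs on VO alone, the less the heading should be trusted.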
+
+### Confidence: ⚠️ Medium — novel approach but supported by YFS90 results
+
+---
+
+## Dimension 3: Processing Speed (<5s per image)
+
+### Fact Confirmation
+- LightGlue: ~20-50ms per pair (Fact #5)
+- SuperPoint extraction: ~50-100ms per image
+- GPU-accelerated ORB-SLAM3: 30 FPS (Fact #12)
+- NaviLoc: 9 FPS on Raspberry Pi 5 (Fact #4)
+
+### Pipeline Time Budget Estimate (per image on RTX 2060)
+1. SuperPoint feature extraction: ~80ms
+2. LightGlue VO matching (vs previous frame): ~40ms
+3. Homography estimation + position update: ~5ms
+4. Satellite tile crop (from cache): ~10ms
+5. SuperPoint extraction on satellite crop: ~80ms
+6. LightGlue satellite matching: ~60ms
+7. Position correction + sliding window optimization: ~20ms
+8. Total: ~295ms ≈ 0.3s
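+
+The budget arithmetic above can be kept as a checked table so that later changes to stage estimates are re-validated automatically. The stage keys are hypothetical labels for the seven steps; the millisecond values are the estimates listed above.
+
```python
# Per-image stage estimates in milliseconds, copied from the list above.
STAGE_MS = {
    "superpoint_uav": 80,
    "lightglue_vo": 40,
    "homography_update": 5,
    "satellite_crop": 10,
    "superpoint_satellite": 80,
    "lightglue_satellite": 60,
    "position_optimization": 20,
}


def total_ms(stages):
    """Sum of all stage estimates in milliseconds."""
    return sum(stages.values())


def within_budget(stages, budget_s=5.0):
    """True if the summed stage time fits the per-image budget."""
    return total_ms(stages) / 1000.0 <= budget_s


def headroom_factor(stages, budget_s=5.0):
    """How many times the estimated total fits into the budget."""
    return budget_s * 1000.0 / total_ms(stages)
```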
+
+### Conclusion
+Processing fits comfortably within the 5s budget: the estimated ~0.3s core pipeline leaves more than an order of magnitude of headroom. Even with additional overhead (satellite tile download, perspective warping, GIM fallback), the pipeline is expected to stay under 2s.
+
+### Confidence: ✅ High
+
+---
+
+## Dimension 4: Sharp Turns & Route Disconnection
+
+### Fact Confirmation
+- At <5% overlap, consecutive feature matching will fail
+- Satellite matching can provide absolute position independently of VO
+- DUSt3R/MASt3R handle extreme low overlap (+50% completeness vs COLMAP)
+- YFS90 handles positioning failures with re-localization
+
+### Reference Comparison
+- Traditional VO systems fail at sharp turns — this is expected and acceptable
+- The segments-first architecture treats each continuous VO chain as a segment
+- Satellite matching re-localizes at the start of each new segment
+- If satellite matching fails too → wider search area → user input
+
+### Conclusion
+The system should not try to match across sharp turns. Instead:
+1. Detect VO failure (low match count / high reprojection error)
+2. Start new segment
+3. Attempt satellite geo-referencing for new segment start
+4. Each segment is independently positioned in the global satellite coordinate frame
+
+This is architecturally simpler and more robust than trying to bridge disconnections.
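+
+The four steps above can be sketched as a small segment manager. The thresholds `MIN_VO_MATCHES` and `MAX_REPROJ_ERR_PX` are placeholder values that would need tuning on real data, and the class and field names are hypothetical.
+
```python
from dataclasses import dataclass, field

MIN_VO_MATCHES = 50       # assumed threshold, needs tuning
MAX_REPROJ_ERR_PX = 3.0   # assumed threshold, needs tuning


@dataclass
class Segment:
    images: list = field(default_factory=list)
    anchored: bool = False  # True once any frame has a satellite match


class SegmentManager:
    def __init__(self):
        self.segments = [Segment()]

    def add_frame(self, image_id, vo_matches, reproj_err_px, satellite_ok):
        """Assign a frame to the current segment, or open a new one when
        VO against the previous frame is judged to have failed."""
        current = self.segments[-1]
        vo_failed = bool(current.images) and (
            vo_matches < MIN_VO_MATCHES or reproj_err_px > MAX_REPROJ_ERR_PX)
        if vo_failed:
            current = Segment()
            self.segments.append(current)
        current.images.append(image_id)
        if satellite_ok:
            current.anchored = True
        return current

    def floating_segments(self):
        """Segments with no satellite anchor; candidates for user input."""
        return [s for s in self.segments if not s.anchored]
```
+
+Segments returned by `floating_segments` are the ones that would trigger the wider search area or the user-input fallback.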
+
+### Confidence: ✅ High
+
+---
+
+## Dimension 5: Satellite Image Matching Reliability
+
+### Fact Confirmation
+- Google Maps at zoom 18: 0.40 m/px at lat 48° — meets AC requirement
+- Eastern Ukraine imagery may be 2-5 years old (Fact #7)
+- SuperPoint+LightGlue is best performer for satellite matching (Source comparison study)
+- Perspective warping improves cross-view matching significantly
+- 93% match rate achieved in ITU thesis (Fact #3)
+
+### Reference Comparison
+- The main risk is satellite image freshness in conflict zone
+- Natural terrain features (rivers, forests, field boundaries) are relatively stable over years
+- Man-made features (buildings, roads) may change due to conflict
+- Agricultural field patterns change seasonally
+
+### Conclusion
+Satellite matching will work reliably in areas with stable natural features. Performance degrades in:
+1. Areas with significant conflict damage (buildings destroyed)
+2. Areas with seasonal agricultural changes
+3. Areas with very homogeneous texture (large uniform fields)
+
+Mitigation: use multiple scale levels, widen search area, accept lower confidence.
+
+### Confidence: ⚠️ Medium — depends heavily on operational area characteristics
+
+---
+
+## Dimension 6: Architecture Selection
+
+### Fact Confirmation
+- YFS90 architecture (VO + satellite matching + terrain-weighted optimization) achieves <7m
+- ITU thesis architecture (ORB-SLAM3 + SIM) achieves GPS-level accuracy
+- NaviLoc architecture (VPR + trajectory optimization) achieves 19.5m
+
+### Reference Comparison
+- YFS90 is closest to our requirements: no IMU, satellite matching, drift correction
+- Our system adds: segment management, real-time streaming, user fallback
+- We need simpler VO than ORB-SLAM3 (no map building needed)
+- We need faster matching than SuperGlue (LightGlue preferred)
+
+### Conclusion
+Hybrid architecture combining:
+- YFS90-style sliding window optimization for drift correction
+- SuperPoint + LightGlue for both VO and satellite matching (unified feature pipeline)
+- Segments-first architecture for disconnection handling
+- FastAPI + SSE for real-time streaming
+
+### Confidence: ✅ High
@@ -0,0 +1,57 @@
+# Validation Log
+
+## Validation Scenario
+Using the provided sample data: 60 consecutive images from a flight starting at (48.275292, 37.385220) heading generally south-southwest. Camera: 26MP at 400m altitude.
+
+## Expected Behavior Based on Conclusions
+
+### Normal consecutive frames (AD000001-AD000032)
+- VO successfully matches consecutive frames (60-73% overlap)
+- Satellite matching every 5-10 frames provides absolute correction
+- Position error stays within 20-50m corridor around ground truth
+- Heading estimated from homography, corrected by satellite matching
+
+### Apparent maneuver zone (AD000033-AD000048)
+- The coordinates show the UAV making a complex turn around images 33-48
+- Some consecutive pairs may have low overlap → VO quality drops
+- Satellite matching becomes the primary position source
+- New segments may be created if VO fails completely
+- Position confidence drops in this zone
+
+### Return to straight flight (AD000049-AD000060)
+- VO re-establishes strong consecutive matching
+- Satellite matching re-anchors position
+- Accuracy returns to normal levels
+
+## Actual Validation (Calculated)
+
+Distances between consecutive samples in the data:
+- AD000001→002: ~180m (larger than the stated 100m; the problem description likely understates the spacing)
+- AD000002→003: ~115m
+- Typical gap: 80-180m
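+- With a 376m × 250m footprint, a 180m gap gives ~52% overlap if the flight direction runs along the 376m side, and ~28% if it runs along the 250m side; a 100m gap gives 73% and 60% respectively. Overlap along the longer side remains sufficient for VO.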
+
+At the turn zone (images 33-48):
+- AD000041→042: ~230m with direction change → overlap may drop to 30-40%
+- AD000042→043: ~230m with direction change → overlap may drop significantly
+- AD000045→046: ~160m with direction change → may be <20% overlap
+- These transitions are where VO may fail → satellite matching needed
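+
+The overlap figures in this section follow from simple footprint geometry, sketched below. The 2D variant treats both footprints as axis-aligned rectangles and ignores rotation between frames, so it overestimates overlap during turns; function names are hypothetical.
+
```python
def overlap_fraction(gap_m: float, footprint_m: float) -> float:
    """1D overlap between two consecutive footprints whose centres are
    gap_m apart along a side of length footprint_m."""
    return max(0.0, 1.0 - gap_m / footprint_m)


def overlap_fraction_2d(dx_m: float, dy_m: float,
                        width_m: float = 376.0,
                        height_m: float = 250.0) -> float:
    """Area overlap of two equal axis-aligned footprints offset by
    (dx_m, dy_m). Rotation between frames is ignored."""
    return (overlap_fraction(abs(dx_m), width_m)
            * overlap_fraction(abs(dy_m), height_m))


if __name__ == "__main__":
    for gap in (100, 180, 230):
        print(gap,
              round(overlap_fraction(gap, 376.0), 2),
              round(overlap_fraction(gap, 250.0), 2))
```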
+
+## Counterexamples
+
+1. **Homogeneous terrain**: If a section of the flight is over large uniform agricultural fields with no distinguishing features, both VO and satellite matching may fail. Mitigation: use higher zoom satellite tiles, rely on VO with lower confidence.
+
+2. **Conflict-damaged area**: If satellite imagery shows pre-war structures that no longer exist, satellite matching will produce incorrect position estimates. Mitigation: confidence scoring will flag inconsistent matches.
+
+3. **FullHD resolution flight**: At GSD 20cm/pixel instead of 6cm, ground resolution is ~3.3x coarser, so matching precision can be expected to degrade by a similar factor. The 50m target may still be achievable but 20m will be very difficult.
+
+## Review Checklist
+- [x] Draft conclusions consistent with fact cards
+- [x] No important dimensions missed
+- [x] No over-extrapolation
+- [x] Issue found: The problem states "within 100 meters of each other" but actual data shows 80-230m. Pipeline must handle larger baselines.
+- [x] Issue found: Tile download strategy needs to handle unknown route direction — progressive expansion needed.
+
+## Conclusions Requiring Revision
+- Photo spacing is 80-230m not strictly 100m — increases the range of overlap variations. Still functional but wider variance than assumed.
+- Route direction is unknown at start — satellite tile pre-loading must use expanding radius strategy, not directional pre-loading.
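+
+A minimal sketch of the expanding radius strategy, assuming the standard Web Mercator XYZ tile scheme and square rings of tiles around the tile containing the start position. Function names are hypothetical, and clamping at the edges of the tile grid is omitted.
+
```python
import math


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int):
    """Convert WGS84 coordinates to XYZ tile indices (Web Mercator)."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y


def ring_tiles(cx: int, cy: int, radius: int):
    """Tiles whose Chebyshev distance from (cx, cy) equals radius."""
    if radius == 0:
        return [(cx, cy)]
    return [(cx + dx, cy + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if max(abs(dx), abs(dy)) == radius]


def expanding_tiles(lat_deg, lon_deg, zoom, max_radius):
    """Yield (radius, tiles) from the start tile outwards, so the nearest
    tiles are downloaded first regardless of flight direction."""
    cx, cy = latlon_to_tile(lat_deg, lon_deg, zoom)
    for radius in range(max_radius + 1):
        yield radius, ring_tiles(cx, cy, radius)
```
+
+Once a few frames have been positioned and a direction of travel emerges, the download order could be biased toward that direction; until then the rings keep coverage symmetric around the start point.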