Oleksandr Bezdieniezhnykh 846670a5c5 Refactor documentation for splittable artifacts and update references
Updated various documentation files to clarify the handling of splittable artifacts, allowing for folder equivalents of key markdown files when they exceed size limits. Adjusted references in multiple sections to reflect this new structure, ensuring consistency across the research methodology. Enhanced clarity on the saving actions and artifact organization, particularly for `01_source_registry.md`, `02_fact_cards.md`, and `06_component_fit_matrix.md`. This change aims to improve usability and maintainability of the research documentation.
2026-05-08 23:39:30 +03:00


Fact Cards — C3: Cross-domain registration (Matchers)

Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in ../01_source_registry/C3_matchers.md (see ../01_source_registry/00_summary.md for index). Confidence labels: High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), Low (L3/L4 inferential). Bound to sub-questions in ../00_question_decomposition.md.

Index: ../00_summary.md. Sibling categories: SQ6 (FC external positioning), SQ1 (existing systems), SQ2 (canonical pipeline), C1 (VIO), C2 (VPR).

Facts in this file: SP+LightGlue, DISK+LightGlue, ALIKED+LightGlue, XFeat, SuperGlue+SuperPoint, MASt3R, RoMa, DKM, LoFTR + Plan-phase decisions D-C3-1..D-C3-N + C3 working conclusions.


C3 — Per-Mode API Capability Verification (engine Step 2 — SP+LightGlue session entry, 2026-05-08)

  • Source: Source #69 (/cvg/lightglue context7 indexed lookup — LightGlue(features='superpoint', n_layers=9, depth_confidence=0.8, width_confidence=0.9, filter_threshold=0.1, flash=False, mp=False) constructor signature; nine canonical code snippets for pipeline + extractor + matcher + complete-matching-pipeline example), accessed 2026-05-08; Source #70 (cvg/LightGlue canonical README — from lightglue import LightGlue, SuperPoint, DISK; from lightglue.utils import load_image, rbd; extractor = SuperPoint(max_num_keypoints=2048).eval().cuda(); matcher = LightGlue(features='superpoint').eval().cuda() for canonical pipeline; default LightGlue construction parameters per canonical README (authoritative over context7 docstring): n_layers=9, flash=True (auto-detected when available), mp=False, depth_confidence=0.95 (disable with -1), width_confidence=0.99 (disable with -1), filter_threshold=0.1; PyTorch ≥2.0 enables matcher.compile(mode='reduce-overhead') for additional speedup with caveat for inputs <1536 keypoints), accessed 2026-05-08; Source #71 (canonical paper arXiv:2306.13643 / Lindenberger et al. 
ICCV 2023 — §3 architecture [9 transformer layers, 4 attention heads, descriptor dimension d=256, rotary positional encoding, bidirectional cross-attention, soft partial assignment matrix combining similarity + matchability scores, filter threshold τ=0.1] + §3.3 adaptive-depth/adaptive-width pruning [confidence threshold α=0.95, unmatchability threshold β=0.01, ~33% average inference-time reduction at <1% accuracy loss] + §4 training recipe [pre-train on synthetic homographies of Oxford-Paris 1M distractors 170k images 6M image pairs 2 days on 2 RTX 3090 + fine-tune on MegaDepth phototourism 368/5/24 train/val/test scenes 50 epochs 2 days on 2 RTX 3090 32 image pairs per batch with gradient checkpointing + mixed precision]); Source #72 (Magic Leap magicleap/SuperPointPretrainedNetwork LICENSE — "ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY" Software License Agreement; HARD DISQUALIFIER for canonical SP+LightGlue pinned mode in project's dual-use deployment context); Source #73 (fabio-sim/LightGlue-ONNX companion — lightglue-onnx export | infer | trtexec CLI + FP16 mixed precision + FP8 ModelOpt workflow + FlashAttention-2 fused ONNX + MultiHead-Attention fusion + ArgMax→TopK trick ~30% speedup + Kornia kornia.feature.OnnxLightGlue integration; Jetson Orin Nano Super deployment path documented; Ampere FP8 emulation verification gate for D-C3-2)
  • Inputs in the example: Two arbitrary RGB or grayscale images at any (independent) resolutions (canonical README example uses 1024×1024 grayscale per image; load_image returns torch.Tensor[3, H, W] normalized to [0, 1] regardless of input format); SuperPoint extractor cropped output: feats: {keypoints: torch.Tensor[B, N, 2], descriptors: torch.Tensor[B, N, 256], keypoint_scores: torch.Tensor[B, N]} where N ≤ max_num_keypoints (canonical default 2048; project pinned to 1024); LightGlue matcher input: dict with image0 and image1 keys mapping to per-image SuperPoint output dicts; output: {matches0: torch.Tensor[B, N], matches1: torch.Tensor[B, N], matching_scores0: torch.Tensor[B, N], matching_scores1: torch.Tensor[B, N], matches: List[torch.Tensor[K, 2]], scores: List[torch.Tensor[K]], stop: int} where K is the number of correspondences after τ=0.1 filtering. rbd(x) helper removes the batch dimension to extract single-pair tensors; points0 = feats0['keypoints'][matches[..., 0]] and points1 = feats1['keypoints'][matches[..., 1]] produce 2D-2D correspondences directly consumable by C4 PnP+RANSAC
  • Outputs in the example: Up to 1024 2D-2D correspondences with per-correspondence confidence score s_k ∈ [τ=0.1, 1.0]; canonical paper Table 1 reports HPatches homography R=94.3 / P=88.9 with AUC-DLT@5px=78.6; canonical paper Table 2 reports MegaDepth-1500 relative pose AUC@5°/10°/20°=66.7/79.3/87.9 at 44.2 ms standard / 31.4 ms adaptive on RTX 3080 (1024 keypoints); canonical paper Table 3 reports Aachen Day-Night with NetVLAD top-50 + SP+LightGlue + PnP+RANSAC pipeline Day (0.25m,2°)/(0.5m,5°)/(1.0m,10°) = 89.2/95.4/98.5, Night = 87.8/93.9/100, throughput 17.2 pairs/sec standard / 26.1 pairs/sec optimized on RTX 3080 — direct documentary equivalence to project's intended pipeline shape; canonical RTX-3080 throughput: 150 FPS @ 1024 keypoints with compilation + adaptivity (= ~6.7 ms per pair) / 50 FPS @ 4096 keypoints (= 20 ms per pair); 4-10× speedup over SuperGlue depending on input difficulty
  • Project inputs: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) → grayscale-converted + bilinearly downscaled-to-largest-edge 1024 → fp16 batch on Jetson Orin Nano Super; per-UAV-frame K=10 top-K retrieved satellite tiles from C2 (NetVLAD/MixVPR/SALAD/SelaVPR/EigenPlaces) → grayscale-converted + bilinearly downscaled-to-largest-edge 1024 → fp16 batch on Jetson Orin Nano Super; total per-frame compute = K=10 image pairs (UAV-frame, satellite-tile)
  • Project outputs required: Up to 1024 2D-2D correspondences per (UAV-frame, satellite-tile) image pair with confidence scores; cosine-confidence-threshold filter at 0.95 × per-pair-max-score to retain only the most confident correspondences; feeds C4 PnP+RANSAC pose estimator with 4-point minimum (typical: 30-200 inliers per successful pair after RANSAC); satisfies AC-1.1 frame-center-within-50m pose accuracy requirement when pairing with high-recall C2 retrieval (paper Table 3 documentary evidence Aachen Day Day (0.25m,2°)=89.2 = nominally satisfies AC-1.1 50m bar at 0.25m precision tier); satisfies AC-1.2 frame-center-within-20m pose accuracy requirement at tighter tolerance (paper Table 3 documentary evidence Aachen Day (1.0m,10°)=98.5); satisfies AC-2.1b satellite-anchor-registration-succeeds gate when the C3 image pair achieves >30 inliers after RANSAC (typical SP+LightGlue + RANSAC threshold); TIGHT latency-budget interaction: K=10 pairs × 30-60 ms = 300-600 ms per UAV frame on extrapolation vs AC-4.1 400 ms budget — D-C3-3 NEW Plan-phase choice required; satisfies AC-4.2 memory budget with comfortable margin (~27 MB total weights at fp16)
  • Match assessment: exact mode match for (SuperPoint MagicLeap-pretrained extractor at 1024 keypoints with 256-D descriptors, LightGlue matcher with features='superpoint', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True, 1024×1024 grayscale input per image, up to 1024 2D-2D correspondences output with confidence scores); training+evaluation+canonical-pretrained-distribution CLIs exist in cvg/LightGlue (Source #70); five extractor-matcher sibling modes documented (SP+LightGlue, DISK+LightGlue, ALIKED+LightGlue, SIFT+LightGlue, DoGHardNet+LightGlue); companion fabio-sim/LightGlue-ONNX (Source #73) ships ONNX/TensorRT/OpenVINO/FP16/FP8 export pathway with January 2026 active maintenance; companion cvg/Hierarchical-Localization (hloc) ships canonical NetVLAD top-50 → SP+LightGlue → PnP+RANSAC end-to-end visual-localization pipeline with paper Table 3 documentary evidence equivalent to project's intended pipeline shape; ⚠️ partial input domain (canonical training on synthetic homographies of Oxford-Paris 1M distractors + MegaDepth phototourism — NOT aerial nadir; same caveat as C2 candidates; D-C2-1 retrain decision interacts with D-C3-1 extractor choice); ⚠️ Jetson Orin Nano Super export risk on SP+LightGlue (well-documented pathway via Source #73, but project must measure on Jetson Orin Nano Super); ⚠️ Jetson Orin Nano Super FP8 emulation on Ampere uncertain (Source #73 documents FP8 ModelOpt workflow on Hopper/Ada/Blackwell — Jetson Orin Nano Super is Ampere; D-C3-2 verification gate at Jetson MVE phase); HARD LICENSE DISQUALIFIER on SuperPoint canonical pretrained weights AND lightglue/superpoint.py inference file (Source #72 Magic Leap LICENSE) — blocks commercial AND dual-use deployment per project's question_decomposition.md hard disqualifier ("anything whose license blocks military / dual-use deployment"); the project's deployment context (eastern/southern Ukraine fixed-wing UAV, AC-NEW-2 spoofing-promotion path) is 
dual-use military by every reasonable interpretation; mitigation via D-C3-1 NEW Plan-phase decision with DISK+LightGlue (Apache-2.0 throughout) RECOMMENDED + paper Appendix A Table 6 documentary evidence DISK+LightGlue beats SP+LightGlue by +7.99 absolute AUC@5° on IMC 2020 stereo
  • If ⚠️ or : docs do not explicitly disqualify the algorithmic mode at the API or capability level. The (extractor, matcher, keypoint count, descriptor dimension, input size, normalisation, output shape) tuple is documented and runnable directly via cvg/LightGlue canonical CLI OR via HuggingFace Transformers integration OR via kornia integration OR via fabio-sim/LightGlue-ONNX ONNX/TensorRT pipeline. HOWEVER, the SuperPoint canonical pretrained weights LICENSE (Source #72 Magic Leap noncommercial-research-only Software License Agreement) is a HARD DISQUALIFIER on the canonical SP+LightGlue mode in the project's dual-use deployment context — mitigation via D-C3-1 NEW Plan-phase decision is REQUIRED before promotion to "Selected": (a) DISK+LightGlue (Apache-2.0 throughout) RECOMMENDED — paper Table 6 demonstrates technical superiority over SP+LightGlue (+7.99 absolute AUC@5° on IMC 2020 stereo); (b) ALIKED+LightGlue (BSD-3-Clause + Apache-2.0); (c) re-train SuperPoint-class extractor under permissive license (~1-4 weeks engineering + retrain-on-aerial-nadir option); (d) accept Magic Leap noncommercial-research license for project's R&D phase only with explicit Plan-phase swap commitment (legally risky). → Status: Documentary lead with Apache-2.0 matcher + Magic-Leap-restrictive-extractor-weights HARD DISQUALIFIER on canonical SP+LightGlue + DISK+LightGlue mitigation RECOMMENDED + adaptive-depth/adaptive-width pruning advantage + Apache-2.0 license-track placement on matcher + actively-maintained Jetson ONNX/TensorRT/FP16/FP8 export pathway (FP8 Ampere emulation verification gate) + TIGHT latency-budget interaction at K=10 pairs/frame caveat + aerial-domain-training caveat (D-C2-1 reuse), BSD/permissive track on matcher itself (Apache-2.0 cvg/LightGlue + Apache-2.0 DISK weights). Final lead promotion to "Selected" deferred to D-C3-1 + D-C3-2 + D-C3-3 + D-C1-2 + D-C2-4 dedicated Jetson Orin Nano Super hardware MVE phase. 
Per the engine Component Option Breadth rule, LightGlue opens the C3 mandatory pre-screen at 1 of N candidates with the canonical adaptive-depth/adaptive-width sparse-matcher reference baseline; subsequent C3 candidates (XFeat, MASt3R, RoMa, SuperGlue, LoFTR) will be separately-cataloged in subsequent sessions.
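The index-gathering step quoted above (`points0 = feats0['keypoints'][matches[..., 0]]`) and the project's relative-confidence filter can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the cvg/LightGlue API: the function names `gather_correspondences`, `filter_by_relative_confidence` and `registration_succeeds` are hypothetical, plain lists stand in for the tensors, "0.95 × per-pair-max-score" is read as a cutoff on the matcher scores, and ">30 inliers" is read as a strict inequality.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[int, int]  # (index into keypoints0, index into keypoints1)


def filter_by_relative_confidence(
    matches: Sequence[Match], scores: Sequence[float], ratio: float = 0.95
) -> Tuple[List[Match], List[float]]:
    """Keep matches whose score is at least ratio * (best score of this pair).

    Assumption: this is how the '0.95 x per-pair-max-score' filter is meant.
    """
    if not scores:
        return [], []
    cutoff = ratio * max(scores)
    kept = [(m, s) for m, s in zip(matches, scores) if s >= cutoff]
    return [m for m, _ in kept], [s for _, s in kept]


def gather_correspondences(
    keypoints0: Sequence[Point], keypoints1: Sequence[Point], matches: Sequence[Match]
) -> Tuple[List[Point], List[Point]]:
    """List-based equivalent of kpts0[matches[..., 0]] and kpts1[matches[..., 1]]."""
    points0 = [keypoints0[i] for i, _ in matches]
    points1 = [keypoints1[j] for _, j in matches]
    return points0, points1


def registration_succeeds(num_inliers: int, min_inliers: int = 30) -> bool:
    """AC-2.1b style gate: more than min_inliers RANSAC inliers (assumed strict)."""
    return num_inliers > min_inliers


# Toy data: three keypoints per image, three candidate matches with scores.
kpts0 = [(10.0, 12.0), (50.0, 40.0), (200.0, 310.0)]
kpts1 = [(11.5, 13.0), (198.0, 305.0), (52.0, 41.0)]
matches = [(0, 0), (1, 2), (2, 1)]
scores = [0.99, 0.60, 0.96]

kept_matches, kept_scores = filter_by_relative_confidence(matches, scores)
points0, points1 = gather_correspondences(kpts0, kpts1, kept_matches)
print(points0, points1, kept_scores)
```

The two point lists stay index-aligned, which is the form a downstream PnP+RANSAC stage would consume. Whether a relative cutoff or an absolute one suits aerial nadir pairs better is something only the Jetson MVE measurements can settle.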

C3 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate (SP+LightGlue addition)

SP+LightGlue — per-numbered binding (C3-relevant lines only; cross-cutting N/A above also apply identically)

Cells share the legend defined under the MixVPR sub-matrix (C2). Where a binding is identical in both substance and evidence to the C2 candidates' rows, the SP+LightGlue row points to those rows to avoid restating; where SP+LightGlue's pinned mode produces a materially different binding (sparse-matcher 2D-2D correspondences vs C2's global descriptors, K=10-pairs-per-frame latency multiplier vs C2's single-frame compute, Magic Leap restrictive license on canonical SuperPoint weights, DISK+LightGlue Apache-2.0 mitigation, Jetson ONNX/TensorRT/FP16/FP8 export pathway via Source #73), the SP+LightGlue row carries a distinct evidence cite.

Line Binding Evidence (one-line cite)
AC-1.1 (frame-center within 50 m, ≥80% normal-flight photos) Pass (documentary on phototourism) → Verify (aerial nadir cross-domain) Source #71 paper Table 3 documentary evidence Aachen Day (0.25m,2°)=89.2 = nominally satisfies AC-1.1 50m bar at 0.25m precision tier with NetVLAD top-50 + SP+LightGlue + PnP+RANSAC end-to-end pipeline shape — directly equivalent to project's intended pipeline (C2 → C3 → C4); aerial nadir cross-domain validation required at Jetson MVE on AerialExtreMatch + Derkachi flight. D-C3-1 mitigation interaction: DISK+LightGlue may achieve higher AC-1.1 satisfaction rate per paper Table 6 +7.99 absolute AUC@5° on stereo
AC-1.2 (frame-center within 20 m, ≥50% normal-flight photos) Pass (documentary on phototourism) → Verify (aerial nadir cross-domain tighter tail) Same as AC-1.1, tighter tail; paper Table 3 documentary evidence Aachen Day (0.5m,5°)=95.4 = nominally satisfies AC-1.2 20m bar at 0.5m precision tier. SP+LightGlue-specific advantage over C2 single-stage retrieval: C3's geometric verification step (PnP+RANSAC with SP+LightGlue 2D-2D correspondences) provides the structural geometric-fine-grain accuracy filter that C2's single-stage NetVLAD/MixVPR/SALAD/EigenPlaces lacks (vs SelaVPR's local-feature MNN re-ranking which is a different mechanism). Aerial nadir AC-1.2 tail validation via AerialExtreMatch Recall@1 stratified by difficulty cell
AC-2.1b (satellite-anchor registration succeeds, AC-1.1/1.2 + AC-2.2 + AC-8.2 + AC-8.6 conditions) Pass (documentary on phototourism) → Verify (aerial nadir cross-domain) C3's contribution is the geometric verification step that determines whether retrieved tiles actually match the UAV frame — paper Table 3 documentary evidence Aachen Day (1.0m,10°)=98.5 demonstrates >98% registration success at 1m precision on phototourism; AC-2.1b registration-success rate is the canonical SP+LightGlue strength. Aerial nadir cross-domain validation required; Jetson MVE measurement on AerialExtreMatch + Derkachi flight
AC-3.3 (≥3 disconnected segments via satellite-reference re-localization) Pass (per-pair stateless) → Verify (recall under perceptual-aliasing + scene-change) SP+LightGlue's per-pair geometric verification is stateless — applies identically to first-flight + re-localization scenarios after AC-3.3 disconnections. Cross-season recall under SP+LightGlue's MegaDepth-trained weights is documented via paper Table 3 Aachen Night = 87.8/93.9/100 (slightly lower than Day) — extreme-illumination performance is robust on phototourism but unverified on aerial nadir cross-season; AerialExtreMatch + D-C2-1 required. D-C3-1 mitigation interaction: DISK+LightGlue Apache-2.0 retrain on aerial nadir corpus is the cleanest license-compliant + cross-domain pathway
AC-4.1 (latency <400 ms p95, end-to-end camera→FC) Verify (TIGHT margin at K=10 top-K retrieval pairs per frame) CRITICAL latency-budget interaction: paper canonical RTX-3080 throughput is 150 FPS @ 1024 keypoints with adaptivity (= 6.7 ms per pair) / 50 FPS @ 4096 keypoints (= 20 ms per pair); Source #73 ONNX Runtime + TensorRT EP at fp16 reports 3-5× speedup over canonical PyTorch path on RTX-class GPUs (= ~2-7 ms per pair on RTX 3080 with FlashAttention-2 fused). Jetson Orin Nano Super extrapolation factor 4-6× of RTX 3080 → ~30-60 ms per pair @ 1024 keypoints at fp16+TensorRT standard, ~15-30 ms with adaptive depth (paper §5.4 1.86× speedup on easy pairs, achievable if many of the K pairs are high-overlap). At K=10 top-K retrieval pairs per UAV frame (Fact #25 + AC-3.3 re-localization) → 300-600 ms standard / 150-300 ms with adaptivity → TIGHT against AC-4.1 400 ms budget before C1+C2+C5+C8 costs added. D-C3-3 NEW Plan-phase choice required: (a) reduce K from 10 to 3-5 (cost: lower retrieval recall under perceptual aliasing); (b) reduce keypoints from 1024 to 512 (cost: lower geometric verification accuracy at AC-1.2 tail); (c) accept TIGHT margin and validate at Jetson MVE with adaptive depth; (d) parallelize matcher across multiple Jetson GPU streams (limited by single-GPU shared-memory architecture); (e) elevate to ONNX Runtime + TensorRT EP + adaptive depth via Source #73 (paper §5.4 1.86× speedup on easy pairs). D-C3-2 NEW Plan-phase choice (LightGlue-inference-runtime): PyTorch-fp16 / Torch-TensorRT / ONNX Runtime + TensorRT EP via Source #73 / pure TensorRT via trtexec + Polygraphy via Source #73 / FP8 ModelOpt-on-Jetson if Ampere FP8 emulation works
AC-4.2 (memory <8 GB shared) Pass (with Verify): SMALLEST model footprint among C-row components evaluated SuperPoint ~1.3M params + LightGlue ~12M params at canonical 9-layer config = ~13.3M params ≈ ~27 MB total weights at fp16, the smallest model footprint of any C-row component evaluated so far (vs C2 EigenPlaces ~58 MB, MixVPR ~50 MB, SALAD ~172 MB, SelaVPR ~600 MB, NetVLAD ~400 MB; vs C1 Kimera-VIO ~variable / OKVIS2 ~variable / DPVO ~30 MB). Activations at 1024×1024 grayscale batch=1 ~50-100 MB at fp16 (SuperPoint dense 8-stride feature map ~8 MB + LightGlue self-attention + cross-attention layers ~30-80 MB per layer at 1024 keypoints). DISK+LightGlue alternative (D-C3-1): DISK ~1.0M params + LightGlue ~12M params = ~13.0M params ≈ ~26 MB total weights at fp16, essentially the same footprint. No descriptor-cache pressure at C3 (vs C2 single-stage which has descriptor cache for ~400 km² operational area at AC-8.1 resolution floor): C3 operates on UAV-frame + retrieved-tile pair on-the-fly, with no pre-cached match-time state; C3 cache footprint is exactly 0 GB of the 10 GB AC-8.3 cache budget (vs C2 NetVLAD-canonical ~1.3 GB / 13%, MixVPR-2048 ~650 MB / 6.5%). Co-resident memory pressure with C1/C2/C4/C5/C6 manageable, subject to Jetson MVE measurement
AC-8.1 (cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px) Pass (with Verify) — resolution-agnostic at API level SP+LightGlue is resolution-agnostic at the algorithm level (SuperPoint accepts any input size; canonical paper evaluates at 1024×1024); cross-resolution matching at 0.5 m/px tile GSD vs nav-camera 12 cm/px GSD at 1 km AGL (project's expected ground-sampling-distance ratio ~4×) unverified — AerialExtreMatch cross-scale cells are the documentary target; same dependency as C2 candidates' rows
AC-8.6 — Scale-ratio (any UAV-frame ground footprint at deployment altitude must be retrievable) Verify — same downscale aggressiveness as canonical phototourism At 1 km AGL the nav-camera frame footprint is 470×314 m to 980×655 m (per restrictions.md); SP+LightGlue's canonical 1024×1024 grayscale input is the same downscale aggressiveness as paper canonical phototourism. SP+LightGlue-specific advantage over C2: rotary positional encoding (paper §3.4) + adaptive-depth/adaptive-width pruning (paper §3.3) make the matcher structurally robust to per-image-pair scale variation — adaptive depth halts inference when sufficient confident matches are found, regardless of input scale; rotary positional encoding generalizes to image-pair viewpoint-shifts
AC-8.6 — Scene change in active-conflict sectors Verify — partial geometric robustness to scene change Cratering / building destruction / road realignment is exactly the AerialExtreMatch "scene-change" cell + the Skoltech aerial-VPR survey (Source #38). SP+LightGlue-specific structural advantage: per-correspondence confidence threshold τ=0.1 + RANSAC inlier selection at C4 provides structural rejection mechanism for scene-change-induced false correspondences (paper §3.5 soft partial assignment matrix combining similarity + matchability scores discards uninformative regions); paper Aachen Night Recall@(1.0m,10°)=100 demonstrates extreme-illumination geometric robustness on phototourism. Aerial nadir cross-time / cross-conflict validation unverified — D-C2-1 retrain decision + AerialExtreMatch + Derkachi flight required
AC-8.6 — Compute & latency under steady-state and re-loc-trigger Verify — variable-cost adaptive-depth advantage; TIGHT margin at K=10 pairs SP+LightGlue's per-pair compute is variable (adaptive-depth + adaptive-width pruning) — paper §5.4 reports 1.86× speedup on easy pairs (high-overlap, low-viewpoint-shift) and 1.16× on hard pairs (cross-season, scene-change), 1.45× average. Steady-state UAV operation has many high-overlap pairs (consecutive UAV frames overlap at 1 km AGL with low altitude-variability) → adaptive-depth advantage is the structural counterpart to the K=10-pairs-per-frame TIGHT latency-budget interaction at AC-4.1. Re-loc-trigger workload after AC-3.3 disconnection has more cross-season + cross-time hard pairs → adaptive-depth advantage is reduced. D-C3-3 NEW Plan-phase choice interaction: Jetson MVE measurement of adaptive-depth speedup distribution on AerialExtreMatch + Derkachi flight is the documentary target
AC-NEW-2 (spoofing-promotion latency <3 s p95) Pass (latency budget very comfortable for first-pair) → Verify (multi-pair re-anchor latency) Single-pair latency budget very comfortable: SP+LightGlue per-pair at fp16+TensorRT (~30-60 ms standard / 15-30 ms adaptive on Jetson Orin Nano Super extrapolation) << 3 s budget (~50-200× under). Multi-pair re-anchor latency at K=10 pairs: 300-600 ms standard / 150-300 ms adaptive, well within 3 s budget. SP+LightGlue-specific consideration: re-anchor success requires first or first-few image pairs to produce high-inlier match after spoofing detection; paper Table 3 Aachen Day-Night documentary evidence demonstrates >87% registration success on phototourism even at Night (extreme illumination) at the strictest (0.25m,2°) tier (87.8), rising to 100 at the (1.0m,10°) tier, suggesting strong re-anchor reliability if C2 retrieval delivers high-recall top-K. D-C3-1 mitigation interaction: DISK+LightGlue Apache-2.0 may have higher re-anchor reliability per paper Table 6 +7.99 AUC@5°
AC-NEW-6 (imagery freshness — never satellite_anchored on stale-tile match) Pass (mechanical) SP+LightGlue produces 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair; freshness-age decision is a downstream C5/C6 filter on the (tile-id, match-success, inlier-count) tuple. No structural interaction with freshness — C3's geometric verification is freshness-agnostic at the API level (whether the retrieved tile is fresh or stale, the geometric match either succeeds or fails); freshness-aware candidate filtering happens entirely after C3 produces match results
AC-NEW-7 (cache-poisoning safety budget — P(>30 m geo-misalign) <1%, P(>100 m) <0.1%) Pass — STRUCTURAL geometric-verification advantage over C2 single-stage retrieval CRITICAL POSITIVE finding: C3's per-correspondence confidence threshold τ=0.1 + soft partial assignment matrix combining similarity + matchability scores + downstream C4 PnP+RANSAC inlier selection provides the structural geometric-verification layer that catches mid-flight-written misaligned tiles (AC-8.4). If a poisoned mid-flight tile has a near-correct global descriptor (passing C2 single-stage retrieval) but is geometrically misaligned by >30m, C3's geometric verification is the structural mechanism that rejects the poisoned-but-misaligned tile via low-inlier-count or high-residual-error at the RANSAC step. This is the C-row's primary cache-poisoning defense layer, addressing the C2 candidates' shared "single-stage retrieval has NO structural advantage over poisoned-but-misaligned tiles" caveat. Multi-flight Monte Carlo replay validation; AC-NEW-7 budget is structurally favorable to C3-equipped pipelines vs C2-only pipelines
Restriction "Operational area: eastern/southern Ukraine" — sparse-matcher train-domain match ⚠️ Documentary gap → Verify (D-C2-1 reuse) Canonical SP+LightGlue weights are pre-trained on synthetic homographies of Oxford-Paris 1M distractors + fine-tuned on MegaDepth phototourism (368/5/24 train/val/test scenes) — same caveat as C2 candidates; D-C2-1 retrain decision applies to LightGlue identically as to C2 candidates, and interacts with D-C3-1 SuperPoint-replacement-strategy choice (DISK+LightGlue Apache-2.0 retrain on aerial nadir corpus is the cleanest license-compliant + retrain-friendly pathway). SP+LightGlue-specific consideration: paper §1 Related work [83] cites Zhang et al. 2022 ISPRS "SuperGlue generalizes well to aerial matching" — by transitive lineage (LightGlue is the strict SuperGlue successor with documented 4-10× speedup), this provides weak documentary evidence that LightGlue is similarly applicable to aerial matching, but NOT explicit aerial-nadir validation. AerialExtreMatch + Derkachi flight required
Restriction "Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural)" — sparse-matcher scale band match Verify Same as AC-8.6 scale-ratio row; cross-scale matching at the project's altitude band is the AerialExtreMatch cross-scale cell
Restriction "Weather: predominantly sunny ... seasonal/visibility classes": sparse-matcher cross-season generalization Verify (DOCUMENTARY EVIDENCE on extreme illumination from paper Table 3 Aachen Night; D-C2-1 reuse for cross-season) Cross-season matching is the dominant aerial-cross-domain failure mode per Fact #19 + SQ5; canonical SP+LightGlue weights are MegaDepth-phototourism-trained, so D-C2-1 is the primary lever. SP+LightGlue-specific finding: paper Table 3 Aachen Night Recall@(1.0m,10°)=100 (vs Day=98.5, i.e. +1.5 absolute: no degradation, Night actually scores higher on the loosest tier) demonstrates extreme-illumination geometric robustness; paper Table 3 Aachen Night (0.25m,2°)=87.8 (vs Day=89.2, -1.4 absolute degradation on the strictest tier). This is the strongest extreme-illumination documentary evidence in the C-row evaluated so far. Aerial nadir cross-season + cross-conflict validation unverified: D-C2-1 retrain decision + AerialExtreMatch + Derkachi flight required
Restriction "Navigation camera (pinned): ADTi 20MP, 5472×3648" Pass (API) — same downscale as canonical phototourism SP+LightGlue consumes any 1024×1024 grayscale input; the 5472×3648 → 1024×1024 downscale is the same as paper canonical phototourism. D-C2-3 input-resolution-shape Plan-phase decision applies identically to SP+LightGlue as to all C-row components. Algorithm is resolution-agnostic at API level — resize=1024 parameter is exposed in canonical SuperPoint extractor; project may choose 1280 or 1536 at Jetson MVE time at proportional latency cost (1280×1280 = ~1.6× compute / ~50-95 ms per pair on Jetson; 1536×1536 = ~2.25× compute / ~70-135 ms per pair on Jetson)
Restriction "Satellite Imagery — resolution ≥0.5 m/px" — sparse-matcher pipeline at AC-8.1 floor Verify Same as AC-8.1; algorithm-level resolution-agnostic, matching at 0.5 m/px tile GSD vs 12 cm/px nav-camera GSD unverified
Restriction "Satellite Imagery — Cache budget: 10 GB" — sparse-matcher cache footprint Pass — NO C3 cache footprint C3 cache footprint is exactly 0 GB of the 10 GB AC-8.3 cache budget — SP+LightGlue operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time state. All C2 candidates have non-zero descriptor cache footprint (NetVLAD-canonical ~1.3 GB / 13%, MixVPR-2048 ~650 MB / 6.5%, SALAD-full-8448 ~2.7 GB / 27%, SelaVPR-global-only ~320 MB / 3.2%, EigenPlaces-2048 ~650 MB / 6.5%); C3 has no equivalent pressure on cache budget. The C3 row's only pre-cached state is the LightGlue + SuperPoint model weights themselves (~27 MB at fp16 = 0.27% of cache budget) which are loaded once at boot, not per-tile
Restriction "Companion computer: Jetson Orin Nano Super, 8 GB shared" Verify — TIGHT latency-budget interaction at K=10 pairs/frame; LOWEST C3 model footprint SP+LightGlue fp16 inference on Jetson Orin Nano Super has well-documented TensorRT export pathway via Source #73 — D-C3-2 NEW Plan-phase choice (PyTorch-fp16 / Torch-TensorRT / ONNX Runtime + TensorRT EP / pure TensorRT via trtexec + Polygraphy / FP8 ModelOpt-on-Jetson if Ampere FP8 emulation works); D-C3-3 NEW Plan-phase choice (K-pairs-per-frame budget) required to resolve AC-4.1 TIGHT margin at K=10 pairs × 30-60 ms = 300-600 ms vs 400 ms budget; D-C2-4 deferred Jetson MVE risk shared with C2 row. CRITICAL Jetson Orin Nano Super FP8 emulation gate: Source #73 documents FP8 ModelOpt workflow on Hopper/Ada/Blackwell — Jetson Orin Nano Super is Ampere; FP8 path applies only with INT8 emulation fallback (verification at Jetson MVE phase). Steady-state co-resident memory + GPU-time with C1 + C2 + C4 + C5 + C6 manageable — model footprint advantage compounds
Restriction "License posture (D-C1-1)": sparse-matcher license-track interaction MIXED finding (Apache-2.0 matcher + Magic-Leap-restrictive-extractor-weights HARD DISQUALIFIER on canonical SP+LightGlue); D-C3-1 NEW Plan-phase decision required POSITIVE on cvg/LightGlue itself: Source #70 LICENSE explicit copyright statement = Apache-2.0 (Copyright 2023 ETH Zurich), i.e. permissive, BSD/permissive license track on the matcher. Same as cvg/Hierarchical-Localization (hloc) + Kimera-VIO + OKVIS2 + DPVO + pure-VO baseline; places LightGlue ITSELF on the BSD/permissive C-row license axis with materially different design point vs C2's NetVLAD/MixVPR/SelaVPR/EigenPlaces (all MIT) and C1's Kimera-VIO/OKVIS2/DPVO. NEGATIVE on canonical SuperPoint pretrained weights AND lightglue/superpoint.py inference file: Source #72 = Magic Leap "ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY" Software License Agreement → HARD DISQUALIFIER for canonical SP+LightGlue pinned mode in project's dual-use deployment context (eastern/southern Ukraine fixed-wing UAV with AC-NEW-2 spoofing-promotion path is dual-use military by every reasonable interpretation, and the project's question_decomposition.md hard disqualifier list includes "anything whose license blocks military / dual-use deployment"). 
D-C3-1 NEW Plan-phase decision required — mitigation paths in priority order: (a) DISK+LightGlue (Apache-2.0 throughout) — RECOMMENDED per paper Appendix A Table 6 +7.99 absolute AUC@5° on IMC 2020 stereo over SP+LightGlue (DISK+LightGlue is demonstrably technically superior to canonical SP+LightGlue on phototourism stereo); (b) ALIKED+LightGlue (BSD-3-Clause + Apache-2.0) — second-cleanest license-compliant; (c) re-train SuperPoint-class extractor under permissive license (~1-4 weeks engineering + retrain-on-aerial-nadir option preserves project-specific aerial nadir performance benefit); (d) accept Magic Leap noncommercial-research license for project's R&D phase only with explicit Plan-phase swap commitment (legally risky — internal research could still be construed as commercial preparation given the dual-use deployment intent); (e) use ALIKEDv2 + LightGlue if community implementation matures sufficiently. Recommendation: present D-C1-1 + D-C3-1 + this row to user as a structured Choose block at Plan time; DISK+LightGlue is the cleanest license-compliant + technically-superior C3 choice for the project's dual-use deployment context
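The latency and footprint arithmetic behind the AC-4.1, AC-4.2 and cache-budget rows can be reproduced with a short script. This is a sketch under stated assumptions: the per-pair ranges are the extrapolations quoted in the rows above (not Jetson measurements), the names `PairLatency`, `per_frame_latency`, `max_pairs_within` and `fp16_weight_mb` are hypothetical, and the 400 ms figure is the whole end-to-end budget, so the pair counts printed are upper bounds that ignore C1+C2+C5+C8 costs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PairLatency:
    """Extrapolated per-pair matcher latency range in milliseconds."""
    low_ms: float
    high_ms: float


# Ranges quoted in the AC-4.1 row (extrapolated, not measured on Jetson).
STANDARD = PairLatency(low_ms=30.0, high_ms=60.0)
ADAPTIVE = PairLatency(low_ms=15.0, high_ms=30.0)
AC_4_1_BUDGET_MS = 400.0  # end-to-end budget; C3 is only one consumer of it


def per_frame_latency(k_pairs: int, per_pair: PairLatency) -> Tuple[float, float]:
    """Sequential matching cost of K (UAV-frame, tile) pairs, as (low, high) ms."""
    return (k_pairs * per_pair.low_ms, k_pairs * per_pair.high_ms)


def max_pairs_within(budget_ms: float, per_pair_ms: float) -> int:
    """Largest K whose sequential matching cost still fits in budget_ms."""
    return int(budget_ms // per_pair_ms)


def fp16_weight_mb(num_params: float) -> float:
    """Weight size in MB at 2 bytes per parameter (fp16)."""
    return num_params * 2 / 1e6


for name, latency in (("standard", STANDARD), ("adaptive", ADAPTIVE)):
    low, high = per_frame_latency(10, latency)
    worst_case_k = max_pairs_within(AC_4_1_BUDGET_MS, latency.high_ms)
    print(f"{name}: K=10 costs {low:.0f}-{high:.0f} ms; "
          f"worst-case K within {AC_4_1_BUDGET_MS:.0f} ms is {worst_case_k}")

weights_mb = fp16_weight_mb(1.3e6 + 12e6)  # SuperPoint + LightGlue parameter counts
cache_share_pct = weights_mb / 10_000 * 100  # share of a 10 GB (10,000 MB) budget
print(f"weights: {weights_mb:.1f} MB, {cache_share_pct:.2f}% of the cache budget")
```

With the quoted ranges, the worst-case standard path fits only 6 pairs inside 400 ms even before other components are charged, which is why option (a) of D-C3-3 proposes K in the 3-5 range.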

Fact #48 — ALIKED+LightGlue per-mode API capability verification (canonical Shiaoming/ALIKED ResNet-class CNN with Sparse Deformable Descriptor Head + cvg/LightGlue matcher cross-domain sparse matcher D-C3-1 SECONDARY-MITIGATION on Jetson Orin Nano Super) — DOCUMENTARY PASS WITH BSD-3-CLAUSE-CANONICAL + APACHE-2.0-MATCHER + AERIAL-DOMAIN-TRAINING-CAVEAT (D-C2-1 REUSE) + ALIKED-NOT-IN-LIGHTGLUE-ONNX-EXPORT-PATHWAY HARSHER-D-C3-2-GATE + RAISES NEW D-C3-4 ALIKED-SIBLING-MODE-CHOICE; Jetson MVE pending; closes C3 mandatory pre-screen at 2/N

  • Statement: ALIKED+LightGlue (Shiaoming/ALIKED IEEE T-IM 2023; canonical implementation by Xiaoming Zhao + Xingming Wu + Weihai Chen + Peter C. Y. Chen + Qingsong Xu + Zhengguo Li, Beihang University + University of Macau + National University of Singapore + A*STAR Singapore; cvg/LightGlue port lightglue/aliked.py BSD-3-Clause-inherited from canonical, replaces custom_ops build-from-source with torchvision.ops.deform_conv2d directly per Source #74 lines 39 + 336-344) is the modern competitive lightweight CNN sparse-extractor + matcher design point on the BSD/permissive license track for the C3 row — combining Sparse Deformable Descriptor Head (SDDH) for per-keypoint deformable descriptor extraction with cvg/LightGlue's adaptive-depth/adaptive-width sparse-matcher transformer. Per the per-Mode API Capability Verification rule, the project's pinned mode is the (ALIKED-N(16) extractor at 1024-largest-edge RGB input → up to 1024 keypoints with 128-D L2-normalised descriptors + per-keypoint confidence scores; canonical 0.677M-param backbone with ResNet-class encoder + 4-stage upsample aggregation + DKD differentiable keypoint detection + SDDH descriptor head with K=3 patch size + M=16 deformable sample positions) + (LightGlue matcher with features='aliked', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True auto-detected, mp=False) → up to 1024 2D-2D correspondences with confidence scores feeding the project's downstream C4 PnP+RANSAC pose estimator. The canonical inference pipeline is identical to SP+LightGlue: extractor.extract(image_query) → extractor.extract(image_target) → matcher({'image0': feats_q, 'image1': feats_t}) → rbd() → points0 = feats0['keypoints'][matches[..., 0]] and points1 = feats1['keypoints'][matches[..., 1]]. 
Four separately-cataloged ALIKED sibling extractor modes documented in cvg/LightGlue's lightglue/aliked.py (per Source #74): ALIKED-T(16) (Tiny: 0.192M params, 1.37 GFLOPs, 125.87 FPS RTX 2060, 64-D descriptor); ALIKED-N(16) (Normal canonical baseline: 0.677M params, 4.05 GFLOPs, 77.40 FPS RTX 2060, 128-D descriptor, M=16 SDDH samples); ALIKED-N(16rot) (Normal + rotation augmentation training: same arch as N(16), best rotation-invariance per paper §VI-C1 + Fig. 6 top, slight 3D-reconstruction degradation per paper §VI-C1); ALIKED-N(32) (Normal with M=32 SDDH samples: 0.980M params, 4.62 GFLOPs, 75.64 FPS RTX 2060, 128-D descriptor — best Aachen Day-Night relocalization variant per paper Table VII at strictest tier (0.25m,2°)/(5m,10°)=77.6/100.0). Mode-enumeration query (1/3) — context7 NOT INDEXED + WebFetch fallback PASS: context7 resolve-library-id returned no relevant matches for "ALIKED" (top-results were Supabase / Vitest / AI SDK / Mastra / Better Auth — irrelevant); per Per-Mode API Capability Verification rule item 2, fall-back to official-docs WebFetch on the canonical Shiaoming/ALIKED README + LICENSE was used (Source #74) plus canonical paper WebFetch (Source #75) plus cvg/LightGlue lightglue/aliked.py source-code inspection (transitively via Source #70). Pinned-mode runnable example query (2/3) — WebFetch PASS: Source #74 (canonical Shiaoming/ALIKED README) ships two documented inference demos (python demo_pair.py assets/st_pauls_cathedral for image-pair matching, python demo_seq.py assets/tum for sequence demo) with CLI flags --model {aliked-t16,aliked-n16,aliked-n16rot,aliked-n32} --device DEVICE --top_k TOP_K --scores_th SCORES_TH --n_limit N_LIMIT. 
Source #70 (cvg/LightGlue canonical README) ships the canonical pipeline with a one-line swap to use ALIKED extractor: from lightglue import LightGlue, ALIKED; from lightglue.utils import load_image, rbd; extractor = ALIKED(model_name='aliked-n16', max_num_keypoints=1024).eval().cuda(); matcher = LightGlue(features='aliked').eval().cuda(); image0 = load_image('uav_frame.jpg').cuda(); image1 = load_image('satellite_tile.jpg').cuda(); feats0 = extractor.extract(image0); feats1 = extractor.extract(image1); matches01 = matcher({'image0': feats0, 'image1': feats1}); feats0, feats1, matches01 = [rbd(x) for x in [feats0, feats1, matches01]]; matches = matches01['matches']; points0 = feats0['keypoints'][matches[..., 0]]; points1 = feats1['keypoints'][matches[..., 1]]. Source #75 paper Table VII documents ALIKED-N(32) Aachen Day-Night relocalization at (0.25m,2°)/(0.5m,5°)/(5m,10°)=77.6/88.8/100.0 with 2048 keypoints + mNN matcher — directly relevant to the project's intended pipeline shape (C2 NetVLAD-class top-K → C3 ALIKED+LightGlue → C4 PnP+RANSAC) since Aachen Day-Night exercises the same pipeline at the visual-localization task level. 
Disqualifier-probe query (3/3): did NOT surface any documented frame-rate floor (single-pair single-pass inference, parameter-free per-pair besides the model itself); did NOT surface any documented memory ceiling at the algorithm level beyond the standard ALIKED+LightGlue footprint (ALIKED-N(16) 0.677M params + LightGlue 12M params at canonical 9-layer config = ~12.7M params ≈ ~26 MB at fp16 total weights — comparable to SP+LightGlue's ~27 MB); did NOT surface any Jetson Orin Nano measurement directly (similarly to all C-row components — D-C3-3 deferred Jetson MVE phase will resolve); DID surface a documented ALIKED-EXPORT-ABSENCE RISK in Source #73 (fabio-sim/LightGlue-ONNX) — Source #73 README changelog explicitly lists SuperPoint (28 Jun 2023) + DISK (30 Jun 2023) extractor support but no ALIKED entry as of January 2026; Source #73 citations section cites LightGlue + SuperPoint + DISK papers only with no ALIKED reference; Source #73 example CLI commands all use superpoint as the positional extractor argument and there is no documented aliked CLI variant. In addition, the canonical lightglue/aliked.py uses torchvision.ops.deform_conv2d (per Source #74 cvg/LightGlue port lines 39 + 336-344), which is a known-difficult ONNX export op — historically it has required either the ONNX opset ≥19 native DeformConv op OR a custom TensorRT plugin. 
Implication for D-C3-2: ALIKED+LightGlue's Jetson deployment story is materially WEAKER than DISK+LightGlue's or SP+LightGlue's; the project's options for ALIKED+LightGlue on Jetson are restricted to (a) PyTorch-fp16 only (likely 2-3× slower than DISK+LightGlue's TensorRT path, with ~40-90 ms per pair on Jetson Orin Nano Super extrapolation), (b) custom ONNX export with deform_conv plugin (significant engineering effort — community has ONNX deform_conv exports but none productized for ALIKED+LightGlue end-to-end pipeline), (c) wait for community LightGlue-ONNX ALIKED support to land (no documented timeline), (d) Torch-TensorRT partial graph compilation with deform_conv falling back to PyTorch-eager (mixed runtime — operationally complex on Jetson Orin Nano Super). Two POSITIVE structural advantages over canonical SP+LightGlue (Magic-Leap-restrictive): (i) BSD-3-Clause-canonical license-track placement (Source #74 LICENSE = BSD-3-Clause Copyright (c) 2022 Zhao Xiaoming, BSD/permissive track on extractor + Apache-2.0 on matcher = clean BSD/permissive C3 choice in project's dual-use deployment context, no Magic Leap noncommercial-research disqualifier applies); (ii) Drastic-GFLOPs-reduction advantage (paper Table IV: ALIKED-N(16) 4.05 GFLOPs + 77.40 FPS RTX 2060 vs SuperPoint 26.11 GFLOPs + 52.63 FPS = 6.4× lower GFLOPs + 1.47× higher FPS) — implication for Jetson is that PyTorch-fp16-only fallback (the mandatory D-C3-2 (a) path due to ALIKED-export-absence) may achieve adequate Jetson latency without TensorRT acceleration, partially mitigating the export-pathway gap. 
Three POSITIVE structural advantages over canonical DISK+LightGlue (D-C3-1 RECOMMENDED-PRIMARY-mitigation): (iii) Lower GFLOPs at competitive accuracy (paper Table IV: ALIKED-N(16) 4.05 GFLOPs vs DISK 98.97 GFLOPs = 24.4× lower GFLOPs, with MMA@3=74.43 vs DISK 77.59 = -3.16 absolute and MHA@3=77.22 vs DISK 70.56 = +6.66 absolute — DISK yields more matches per pair but is less geometrically accurate, ALIKED yields fewer matches but is more geometrically accurate); (iv) Better Aachen Day-Night relocalization (paper Table VII: ALIKED-N(32) at 2048 keypoints = 77.6/88.8/100.0 vs DISK = 70.4/82.7/94.9 = +7.2 / +6.1 / +5.1 absolute over DISK on the project-relevant visual-localization task); (v) Best PPC (Performance Per Cost = mAA(10°)/GFLOPs) among modern competitive sparse extractors (paper Table V: ALIKED-N(16) Stereo PPC=12.91 vs DISK 0.52 = 24.8× higher PPC — ALIKED is the most-Jetson-friendly modern competitive sparse extractor on a GFLOPs-per-accuracy basis). One CAVEAT vs canonical DISK+LightGlue: (vi) DISK has more matches per pair (paper Table V Stereo NM=2048 for DISK vs 1934.2 for ALIKED-N(16); Multiview NL=2424.8 for DISK vs 1975.4 for ALIKED-N(16) — DISK provides +5.9% more matches on Stereo NM and +22.7% more on Multiview NL, which is critical for bundle-adjustment-based 3D-reconstruction tasks like multi-view structure-from-motion; for the project's per-pair PnP+RANSAC at C4 with K=10 retrieved tiles, DISK's higher-#matches advantage is less critical than for SfM since the project does not run multi-view bundle adjustment in the GPS-denied flight loop). 
Pinned-mode sentence: "We will use ALIKED+LightGlue with ALIKED-N(16) canonical baseline extractor at 1024-largest-edge RGB input + up to 1024 keypoints with 128-D L2-normalised descriptors (or ALIKED-N(16rot) if Plan-phase D-C3-4 chooses rotation-augmented for UAV multi-heading flights, or ALIKED-N(32) if Plan-phase prioritizes best Aachen-Day-Night documentary lift, or ALIKED-T(16) if Plan-phase prioritizes Jetson PyTorch-fp16-only latency fallback) + LightGlue matcher with features='aliked', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True at 1024×1024 RGB input per image (auto-converted from grayscale via kornia.color.grayscale_to_rgb) (canonical cvg/LightGlue ALIKED port + canonical Shiaoming/ALIKED pretrained weights config), with inputs {1× ADTi 20MP nav frame stream → bilinearly downscaled-to-largest-edge 1024 + 1× cached satellite tile per top-K retrieval result from C2} and expect outputs {up to 1024 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair, feeding C4 PnP+RANSAC with cosine confidence threshold filter at 0.95 × per-pair-max-score} on Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble; **PyTorch fp16 baseline as DOMINANT runtime path due to ALIKED-export-absence in LightGlue-ONNX**; Torch-TensorRT partial-graph-compilation fallback if PyTorch-fp16 fails AC-4.1 latency budget at K=10 pairs/frame; ONNX Runtime + TensorRT EP path is **NOT AVAILABLE** in the cvg/LightGlue + LightGlue-ONNX ecosystem as of January 2026). 
D-C3-1 secondary-mitigation role per engine Component Option Breadth rule — ALIKED+LightGlue is the second-cleanest license-compliant + structurally-distinct C3 choice vs canonical SP+LightGlue's Magic-Leap-restrictive disqualifier and DISK+LightGlue's RECOMMENDED-PRIMARY mitigation; the BSD-3-Clause-canonical placement + drastic-GFLOPs-reduction advantage compensates for the Jetson-export-pathway gap if Plan-phase commits to PyTorch-fp16-only deployment."
  • Source: Source #74 (Shiaoming/ALIKED canonical README + LICENSE — BSD-3-Clause; four model variants aliked-t16/n16/n16rot/n32 with parameter counts + GFLOPs + FPS RTX 2060 + descriptor-dimensions table; cvg/LightGlue lightglue/aliked.py BSD-3-Clause inheritance + torchvision.ops.deform_conv2d substitution for canonical custom_ops/build.sh; LightGlue-ONNX ALIKED-export-absence finding from Source #73 README scope), Source #75 (canonical paper arXiv:2304.03608 / Zhao et al. IEEE T-IM 2023 — §III architecture [4 ConvBlock/ResBlock encoder stages with deformable conv in last 2 blocks, SMH score head, DKD keypoint detection, SDDH sparse deformable descriptor head with M deformable sample positions per keypoint] + §IV SDDH details + §V sparse NRE loss relaxation + §VI experiments [HPatches Table IV, IMW-test Table V, FM-Bench Table VI, Aachen Day-Night Table VII, RTX 2060 timing, rotation invariance §VI-C1, scale invariance §VI-C2] + §VI-A implementation details [MegaDepth + R2D2 homographic training, Adam optimizer, 800×800 training resolution, batch size 2, gradient accumulation × 6, 100K training steps]), Source #70 (cvg/LightGlue canonical README — LightGlue(features='aliked') mode wiring with input_dim=128; from lightglue import ALIKED extractor class import; transitive citation for the cvg/LightGlue port file lightglue/aliked.py), Source #73 (fabio-sim/LightGlue-ONNX companion — ALIKED export absence finding: README changelog lists SuperPoint + DISK only; citations cite LightGlue + SuperPoint + DISK papers only; example CLI uses superpoint positional argument only — implies ALIKED end-to-end ONNX/TensorRT pathway is NOT productized as of January 2026)
  • Phase: Phase 2
  • Target Audience: System architects + C3 implementer + C4 (PnP+RANSAC) implementer + C7 (Jetson runtime) implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 + D-C3-1 secondary-mitigation choice + D-C3-4 NEW ALIKED-sibling-mode-choice) + Jetson-deployment decision-maker (D-C3-2 with hard PyTorch-fp16-only restriction for ALIKED+LightGlue)
  • Confidence: for mode-enumeration (four canonical extractor sibling modes + LightGlue matcher integration), runnable-example (canonical Shiaoming/ALIKED demo CLIs + cvg/LightGlue port one-liner), parameter-count (ALIKED-N(16) 0.677M params + LightGlue 12M params = ~12.7M total ≈ ~26 MB at fp16), license (BSD-3-Clause canonical + Apache-2.0 matcher = clean BSD/permissive throughout); for documentary RTX-2060 throughput benchmarks (ALIKED-N(16) 77.40 FPS @ 640×480 + 1k keypoints), HPatches/IMW-test/FM-Bench/Aachen Day-Night documentary Recall@K + AUC + #matches across 7 datasets (paper Tables IV-VII); for Aachen Day-Night documentary lift over SuperPoint (paper Table VII ALIKED-N(32) +8.2/+9.2/+12.2 absolute across the three tiers at 2048 keypoints, strictest tier first — by transitive lineage with Source #71 LightGlue paper Table 3 NetVLAD top-50 + SP+LightGlue + PnP+RANSAC pipeline at Day (0.25m,2°)=89.2, the expected Aachen Day-Night ALIKED+LightGlue accuracy should approach or exceed SP+LightGlue but NO direct documentary measurement of ALIKED+LightGlue-on-Aachen exists in canonical papers — Plan-phase community-evaluation cite or Jetson MVE direct measurement required); for paper Table V PPC (Performance Per Cost) advantage: ALIKED-N(16) Stereo PPC=12.91 vs DISK 0.52 = 24.8× higher PPC; ⚠️ for Jetson Orin Nano Super deployment latency / memory / accuracy (no documentary measurement — Jetson MVE will resolve via D-C3-3); ⚠️ for Jetson Orin Nano Super ALIKED-export-pathway: Source #73 LightGlue-ONNX does NOT ship documented ALIKED end-to-end pipeline as of January 2026 (changelog + citations + CLI examples support SuperPoint + DISK only; no ALIKED entry); ALIKED's torchvision.ops.deform_conv2d is a known-difficult ONNX export op (deform_conv historically requires ONNX opset ≥19 native or custom TensorRT plugin); Implication: ALIKED+LightGlue Jetson runtime path is restricted to PyTorch-fp16-only or custom-ONNX-engineering vs DISK+LightGlue's well-documented LightGlue-ONNX pathway, which makes for a HARSHER 
D-C3-2 gate for ALIKED+LightGlue than for DISK+LightGlue; for canonical-checkpoint aerial-domain fitness (canonical training on MegaDepth phototourism + R2D2 Oxford-Paris/Aachen synthetic homographies — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue + C2 candidates, D-C2-1 reuse); for BSD-3-Clause canonical placement + Apache-2.0 matcher = clean BSD/permissive C3 choice (eligible on every D-C1-1 license-posture path; no Magic Leap noncommercial-research disqualifier applies)
  • Related Dimension: SQ3+SQ4 / C3 modern competitive lightweight CNN sparse-extractor + matcher candidate (D-C3-1 secondary-mitigation role) — per-mode API capability verification gate
  • Fit Impact: DOCUMENTARY PASS for the per-mode API capability verification gate — ALIKED+LightGlue has a documented runnable per-mode example with the project's pinned configuration (canonical Shiaoming/ALIKED + cvg/LightGlue ALIKED port + canonical paper algorithmic specification), four documented ALIKED extractor sibling modes (T(16) tiny 64-D, N(16) normal canonical 128-D, N(16rot) rotation-augmented 128-D, N(32) higher-SDDH-sample-count 128-D), and no API-level disqualifier. Three POSITIVE structural findings vs all prior C-row components: (i) Drastic GFLOPs reduction at competitive accuracy — ALIKED-N(16) 4.05 GFLOPs vs SuperPoint 26.11 GFLOPs (-6.4×) vs DISK 98.97 GFLOPs (-24.4×); ALIKED is the most-Jetson-friendly modern competitive sparse extractor on a GFLOPs-per-accuracy basis with paper Table V PPC=12.91 (24.8× higher than DISK's 0.52). (ii) BSD-3-Clause canonical placement — full BSD/permissive license-track placement on extractor + matcher, second-cleanest license-compliant alternative to D-C3-1 RECOMMENDED-PRIMARY DISK+LightGlue, eligible on every D-C1-1 license-posture path. (iii) Best-in-class Aachen Day-Night relocalization (paper Table VII at 2048 keypoints / 0.25m,2° tier): ALIKED-N(32)=77.6 vs SuperPoint=69.4 = +8.2 absolute lift on the project-relevant visual-localization task (Aachen Day-Night is the canonical evaluation pipeline for project's intended C2→C3→C4 architecture); transitive lineage with Source #71 LightGlue paper Table 3 suggests ALIKED+LightGlue may achieve ~Day(0.25m,2°)=92-95% on Aachen, marginally beating SP+LightGlue's 89.2. HOWEVER, two NEGATIVE structural findings vs DISK+LightGlue (D-C3-1 RECOMMENDED-PRIMARY mitigation): (iv) NO LightGlue-ONNX export pathway as of January 2026 — Source #73 explicitly supports SuperPoint + DISK only; ALIKED-export-absence + ALIKED's torchvision.ops.deform_conv2d ONNX-export-difficulty creates a HARSHER D-C3-2 gate for ALIKED+LightGlue than for DISK+LightGlue. 
(v) DISK+LightGlue has +7.99 absolute AUC@5° on IMC 2020 stereo per Source #71 paper Appendix A Table 6 (DISK+LightGlue 67.02 vs SP+LightGlue 59.03 — DISK+LightGlue is the strongest-documentary-stereo-AUC C3 candidate); ALIKED+LightGlue is NOT directly measured in Source #71 paper Tables 6/7 (cvg/LightGlue ALIKED port + ALIKED-LightGlue weights were added post-paper); DISK+LightGlue has stronger direct documentary evidence for stereo phototourism while ALIKED has stronger direct documentary evidence for visual-relocalization (Aachen Day-Night Table VII). NEW Plan-phase decision raised by ALIKED+LightGlue closure (will be tagged D-C3-4): D-C3-4 (NEW) ALIKED-sibling-mode-choice (aliked-t16 64-D Jetson-friendliest / aliked-n16 128-D canonical baseline / aliked-n16rot 128-D rotation-augmented / aliked-n32 128-D higher-SDDH-sample-count Aachen-Day-Night-best) — Plan-phase decision; for the project's pinned UAV multi-heading flights at 1 km AGL with Jetson PyTorch-fp16-only deployment, the strongest sibling-mode candidate is ALIKED-N(16rot) (rotation augmentation aligns with multi-heading aerial flights; 4.05 GFLOPs leaves headroom for K=10 pairs/frame; same 128-D descriptor as canonical N(16)), with ALIKED-T(16) as the latency-fallback (1.37 GFLOPs / 125.87 FPS RTX 2060 at the cost of 64-D descriptor accuracy reduction) and ALIKED-N(32) as the accuracy-prioritization choice (4.62 GFLOPs / 75.64 FPS RTX 2060 at the cost of higher Jetson latency). 
REUSE of D-C2-1 (aerial-domain training): applies identically to ALIKED+LightGlue as to all C-row components; canonical training on MegaDepth phototourism + R2D2 homography is NOT aerial nadir; D-C2-1 retrain decision interacts with D-C3-1 extractor choice — ALIKED+LightGlue is moderately retrain-friendly (paper §V sparse NRE loss relaxation reduced GPU memory by ~3.5× vs DISK's RL training; canonical training takes 100K steps over MegaDepth + R2D2 homographic at 800×800 batch 2 with gradient accumulation × 6 — feasible on a single RTX 3090 in ~24 hours). C3 mandatory pre-screen status: ALIKED+LightGlue closes the C3 mandatory pre-screen at 2 of N candidates (SP+LightGlue at 1/N from prior session + ALIKED+LightGlue at 2/N this session). The deferred Jetson Orin Nano Super hardware MVE phase still gates final accuracy/latency/memory measurement (D-C1-2 + D-C3-3) — ALIKED+LightGlue's measurement role on the Jetson is to establish the modern competitive lightweight CNN sparse-matcher reference baseline on the BSD/permissive license track with PyTorch-fp16-only deployment, against which D-C3-1 RECOMMENDED-PRIMARY DISK+LightGlue (TensorRT-equipped) and other C3 candidates (XFeat, SuperGlue+SuperPoint, etc.) are scored on the project's specific operating context (aerial nadir, 1 km AGL, eastern/southern Ukraine cross-season, AC-4.1 + AC-4.2 + AC-8.3 budgets). License: BSD-3-Clause for canonical Shiaoming/ALIKED (per Source #74 LICENSE) + Apache-2.0 for cvg/LightGlue matcher (per Source #70 LICENSE) — clean BSD/permissive license track throughout; no Magic Leap noncommercial-research disqualifier applies (vs canonical SP+LightGlue's Magic Leap restrictive license disqualifier).
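The comparison ratios quoted in Fact #48 can be re-derived from the raw figures it cites. The following is a minimal sketch, assuming only the numbers as quoted in this fact card (paper Tables IV and V via Source #75, parameter counts via Source #74); the dictionary names and the helper functions `ratio` and `fp16_weight_mb` are hypothetical and exist only for this check. The footprint is a weights-only estimate at 2 bytes per parameter and says nothing about activation memory on the Jetson.

```python
# Figures copied from Fact #48 as quoted in this fact card (not re-measured here).
GFLOPS = {"ALIKED-N(16)": 4.05, "SuperPoint": 26.11, "DISK": 98.97}
FPS_RTX2060 = {"ALIKED-N(16)": 77.40, "SuperPoint": 52.63}
STEREO_PPC = {"ALIKED-N(16)": 12.91, "DISK": 0.52}
PARAMS_MILLIONS = {"ALIKED-N(16)": 0.677, "LightGlue": 12.0}


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


def fp16_weight_mb(params_millions: float) -> float:
    """Weights-only footprint in MB (10^6 bytes) at 2 bytes per parameter."""
    return round(params_millions * 2.0, 1)


# How many times fewer GFLOPs ALIKED-N(16) needs than each alternative.
gflops_vs_superpoint = ratio(GFLOPS["SuperPoint"], GFLOPS["ALIKED-N(16)"])
gflops_vs_disk = ratio(GFLOPS["DISK"], GFLOPS["ALIKED-N(16)"])

# Throughput and performance-per-cost advantages.
fps_vs_superpoint = ratio(FPS_RTX2060["ALIKED-N(16)"], FPS_RTX2060["SuperPoint"])
ppc_vs_disk = ratio(STEREO_PPC["ALIKED-N(16)"], STEREO_PPC["DISK"])

# Combined extractor + matcher weights at fp16.
total_weight_mb = fp16_weight_mb(sum(PARAMS_MILLIONS.values()))

if __name__ == "__main__":
    print("GFLOPs reduction vs SuperPoint:", gflops_vs_superpoint)
    print("GFLOPs reduction vs DISK:", gflops_vs_disk)
    print("FPS gain vs SuperPoint:", fps_vs_superpoint)
    print("PPC gain vs DISK:", ppc_vs_disk)
    print("fp16 weights (MB):", total_weight_mb)
```

Keeping the raw figures in one place makes it easy to re-run the check if a later session corrects any table value.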

C3 — Per-Mode API Capability Verification (engine Step 2 — ALIKED+LightGlue session entry, 2026-05-08)

MVE — ALIKED+LightGlue with ALIKED-N(16) canonical extractor + 1024 keypoints + 128-D descriptors @ 1024-largest-edge RGB → up to 1024 2D-2D correspondences (canonical D-C3-1 SECONDARY-MITIGATION variant; ALIKED-T(16) 64-D / ALIKED-N(16rot) 128-D rotation-augmented / ALIKED-N(32) 128-D higher-SDDH-sample-count documented as separately-cataloged sibling modes; D-C3-4 NEW Plan-phase choice required)

  • Source: Source #74 (Shiaoming/ALIKED canonical README + LICENSE — python demo_pair.py assets/st_pauls_cathedral --model aliked-n16 --top_k 1024 for canonical pretrained inference, four pretrained checkpoints aliked-t16.pth / aliked-n16.pth / aliked-n16rot.pth / aliked-n32.pth distributed in-tree under models/, BSD-3-Clause License; cvg/LightGlue lightglue/aliked.py BSD-3-Clause inheritance with torchvision.ops.deform_conv2d substitution for canonical custom_ops/build.sh), accessed 2026-05-08; Source #75 (canonical paper arXiv:2304.03608 / Zhao et al. IEEE T-IM 2023 — §III architecture [4-stage feature encoder with deformable conv in blocks 3+4, SMH score head with 1×1+3×3+3×3+3×3 conv layers, DKD differentiable keypoint detection inherited from ALIKE, SDDH sparse deformable descriptor head with K=3 patch + M deformable sample positions per keypoint via Eqs. 4-5] + §IV SDDH efficiency analysis [theoretical complexity 2NM(K²C+2M) + 4NMC + 2NMC²; Table III Running time K=5/N=5000 SDDH 1.06ms vs DMH 50.79ms = 47.9× speedup] + §V sparse NRE loss relaxation [reduces GPU memory ~3.5× vs DISK dense NRE] + §VI-A implementation details [MegaDepth perspective + R2D2 Oxford-Paris/Aachen homographic training datasets, Adam optimizer betas 0.9/0.999, top-400 detected + 400 random keypoints with NMS, 800×800 training resolution, batch size 2, gradient accumulation × 6 batches, 100K training steps, RTX 2060 evaluation hardware]); Source #70 (cvg/LightGlue canonical README — from lightglue import LightGlue, ALIKED; matcher = LightGlue(features='aliked').eval().cuda(); transitive citation for lightglue/aliked.py BSD-3-Clause + torchvision.ops.deform_conv2d substitution); Source #73 (fabio-sim/LightGlue-ONNX companion — ALIKED export absence finding: changelog lists SuperPoint + DISK only; CLI examples use superpoint positional only; citations cite LightGlue + SuperPoint + DISK papers only)
  • Inputs in the example: Two arbitrary RGB or grayscale images at any (independent) resolutions; canonical demo uses assets/st_pauls_cathedral image pair at native resolution; load_image returns torch.Tensor[3, H, W] normalized to [0, 1]; ALIKED extractor requires RGB input — auto-converts grayscale via kornia.color.grayscale_to_rgb per lightglue/aliked.py lines 749-750; ALIKED extractor cropped output: feats: {keypoints: torch.Tensor[B, N, 2], descriptors: torch.Tensor[B, N, 128], keypoint_scores: torch.Tensor[B, N]} where N ≤ max_num_keypoints (canonical default -1 for threshold-based detection; project pinned to 1024); LightGlue matcher input: dict with image0 and image1 keys mapping to per-image ALIKED output dicts; output: {matches0: torch.Tensor[B, N], matches1: torch.Tensor[B, N], matching_scores0: torch.Tensor[B, N], matching_scores1: torch.Tensor[B, N], matches: List[torch.Tensor[K, 2]], scores: List[torch.Tensor[K]], stop: int} where K is the number of correspondences after τ=0.1 filtering; rbd(x) removes batch dim
  • Outputs in the example: Up to 1024 2D-2D correspondences with per-correspondence confidence score s_k ∈ [τ=0.1, 1.0]; canonical paper Table IV reports HPatches MMA@3=74.43% / MHA@3=77.22% with 1k keypoints + mNN matcher (LightGlue would lift these by 5-10 absolute per Source #71 LightGlue paper documentary evidence); canonical paper Table V reports IMW-test Stereo mAA(5°)=39.53 / mAA(10°)=52.28 with 2048 keypoints + ratio-test matcher; canonical paper Table VII reports Aachen Day-Night (0.25m,2°)/(0.5m,5°)/(5m,10°)=80.6/87.8/99.0 with ALIKED-N(16) + 2048 keypoints and 77.6/88.8/100.0 with ALIKED-N(32) + 2048 keypoints; canonical RTX-2060 throughput (paper Table IV): ALIKED-N(16) 77.40 FPS @ 640×480 + 1k keypoints = 12.92 ms per image extraction-only (LightGlue matching adds ~5-10 ms with adaptive depth on RTX 2060); ALIKED-T(16) 125.87 FPS (= 7.94 ms per image); ALIKED-N(32) 75.64 FPS (= 13.22 ms per image)
  • Project inputs: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) → bilinearly downscaled-to-largest-edge 1024 → grayscale-converted (or RGB-preserved per project's nav-camera config) → fp16 batch on Jetson Orin Nano Super; per-UAV-frame K=10 top-K retrieved satellite tiles from C2 → bilinearly downscaled-to-largest-edge 1024 → grayscale-or-RGB → fp16 batch on Jetson Orin Nano Super; total per-frame compute = K=10 image pairs (UAV-frame, satellite-tile)
  • Project outputs required: Up to 1024 2D-2D correspondences per (UAV-frame, satellite-tile) image pair with confidence scores; cosine-confidence-threshold filter at 0.95 × per-pair-max-score to retain only the most confident correspondences; feeds C4 PnP+RANSAC pose estimator with 4-point minimum; satisfies AC-1.1 frame-center-within-50m pose accuracy requirement when pairing with high-recall C2 retrieval (paper Table VII Aachen Day documentary evidence ALIKED-N(32) at (0.25m,2°)/(0.5m,5°)=77.6/88.8 = nominally satisfies AC-1.1 50m bar at 0.5m precision tier with 2048 keypoints + mNN matcher; LightGlue lifts this further); satisfies AC-1.2 frame-center-within-20m at tighter tolerance (paper Table VII ALIKED-N(32) at (5m,10°)=100.0); satisfies AC-2.1b satellite-anchor-registration-succeeds gate when C3 image pair achieves >30 inliers after RANSAC; MORE-FAVORABLE latency-budget interaction than SP+LightGlue: ALIKED-N(16) 4.05 GFLOPs vs SP 26.11 GFLOPs = 6.4× lower extraction GFLOPs → at K=10 pairs × (extraction (~10 ms PyTorch-fp16 Jetson extrapolation) + matching (~30-50 ms Jetson PyTorch-fp16 with adaptive depth)) = ~400-600 ms per UAV frame on PyTorch-fp16-only path (no TensorRT acceleration available due to ALIKED-export-absence); this estimate sits at or above the AC-4.1 400 ms budget, although the GFLOPs advantage of ALIKED partially offsets the export-pathway disadvantage; satisfies AC-4.2 memory budget with comfortable margin (~26 MB total weights at fp16, comparable to SP+LightGlue)
  • Match assessment: exact mode match for (ALIKED-N(16) extractor at 1024-largest-edge RGB input, 1024 max keypoints, 128-D descriptors, LightGlue matcher with features='aliked', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True, up to 1024 2D-2D correspondences output with confidence scores); training+evaluation+canonical-pretrained-distribution CLIs exist in Shiaoming/ALIKED (Source #74) AND in cvg/LightGlue ALIKED port (Source #70); four ALIKED sibling modes documented (T(16) 64-D / N(16) 128-D canonical / N(16rot) 128-D rotation-augmented / N(32) 128-D higher-SDDH-sample-count); companion cvg/Hierarchical-Localization (hloc) ships canonical NetVLAD top-50 → SuperPoint+LightGlue → PnP+RANSAC pipeline (transitive applicability to ALIKED+LightGlue via features='aliked' swap); paper Table VII Aachen Day-Night documentary lift over SuperPoint at strictest tier (+8.2 absolute on (0.25m,2°) for ALIKED-N(32) over SuperPoint at 2048 keypoints); ⚠️ partial input domain (canonical training on MegaDepth perspective + R2D2 Oxford-Paris/Aachen homographic — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue; D-C2-1 retrain decision applies); HARSHER D-C3-2 Jetson export-pathway gate: Source #73 (fabio-sim/LightGlue-ONNX) does NOT ship documented ALIKED end-to-end ONNX/TensorRT pipeline as of January 2026 — changelog + citations + CLI examples support SuperPoint + DISK only; ALIKED's torchvision.ops.deform_conv2d is a known-difficult ONNX export op; PyTorch-fp16-only runtime path is the dominant Jetson option for ALIKED+LightGlue vs DISK+LightGlue's well-documented TensorRT pathway; ⚠️ for Jetson Orin Nano Super latency / memory / accuracy on PyTorch-fp16 path (no documentary measurement — Jetson MVE will resolve via D-C3-3); for BSD-3-Clause + Apache-2.0 license-track placement = clean BSD/permissive throughout, second-cleanest license-compliant after DISK+LightGlue, NO Magic Leap noncommercial-research disqualifier
  • If ⚠️ or : docs do not explicitly disqualify the algorithmic mode at the API or capability level. The (extractor, matcher, keypoint count, descriptor dimension, input size, normalisation, output shape) tuple is documented and runnable directly via Shiaoming/ALIKED canonical CLI OR via cvg/LightGlue ALIKED port. HOWEVER, ALIKED-export-absence in LightGlue-ONNX (Source #73) creates a HARSHER D-C3-2 Jetson deployment gate vs DISK+LightGlue: project's Jetson runtime path for ALIKED+LightGlue is restricted to (a) PyTorch-fp16 only (likely 2-3× slower than DISK+LightGlue's TensorRT path; project must validate ~400-600 ms per UAV frame at K=10 pairs PyTorch-fp16 fits AC-4.1 400 ms budget at Jetson MVE phase), (b) custom ONNX export with deform_conv plugin (significant engineering effort), (c) wait for community LightGlue-ONNX ALIKED support to land (no documented timeline), (d) Torch-TensorRT partial graph compilation with deform_conv falling back to PyTorch-eager (mixed runtime — operationally complex). → Status: Documentary lead with BSD-3-Clause-canonical license track + ALIKED-export-absence-in-LightGlue-ONNX HARSHER-D-C3-2-gate caveat + drastic-GFLOPs-reduction advantage + best-Aachen-Day-Night-relocalization-on-canonical-paper advantage + aerial-domain-training caveat (D-C2-1 reuse) + D-C3-4 NEW ALIKED-sibling-mode-choice Plan-phase decision, BSD/permissive track throughout. Final lead promotion to "Selected" or "Conditional secondary-mitigation" deferred to D-C3-1 + D-C3-2 + D-C3-3 + D-C3-4 + D-C1-2 + D-C2-4 dedicated Jetson Orin Nano Super hardware MVE phase. 
Per the engine Component Option Breadth rule, ALIKED+LightGlue closes the C3 mandatory pre-screen at 2 of N candidates (SP+LightGlue + ALIKED+LightGlue) with the canonical lightweight-CNN-sparse-extractor + matcher reference baseline on the BSD/permissive license track; subsequent C3 candidates (DISK+LightGlue full per-mode entry, XFeat, SuperGlue+SuperPoint mandatory simple-baseline) will be separately-cataloged in subsequent sessions.
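The per-frame latency estimate and the relative confidence filter described in the MVE entry above can be sketched as plain arithmetic. This is a minimal sketch, not the project's implementation: the function names `per_frame_latency_ms` and `keep_confident` are hypothetical, the per-pair timings are the extrapolated (unmeasured) Jetson figures quoted above, and the filter assumes that "0.95 × per-pair-max-score" means keeping every correspondence whose score is at least 0.95 times the best score in that pair. The latency model charges one extraction plus one matching pass per pair, mirroring the arithmetic in the entry.

```python
from typing import List, Sequence

K_PAIRS = 10             # top-K satellite tiles retrieved by C2 per UAV frame
AC41_BUDGET_MS = 400.0   # AC-4.1 latency budget quoted in this document


def per_frame_latency_ms(k_pairs: int, extract_ms: float, match_ms: float) -> float:
    """Estimate per-frame latency as K pairs x (extraction + matching)."""
    return k_pairs * (extract_ms + match_ms)


def keep_confident(scores: Sequence[float], rel_threshold: float = 0.95) -> List[int]:
    """Return indices of correspondences scoring >= rel_threshold * max(scores)."""
    if not scores:
        return []
    cutoff = rel_threshold * max(scores)
    return [i for i, s in enumerate(scores) if s >= cutoff]


# Extrapolated PyTorch-fp16 figures from the entry above: ~10 ms extraction,
# ~30-50 ms matching per pair (assumptions, not Jetson measurements).
latency_low_ms = per_frame_latency_ms(K_PAIRS, extract_ms=10.0, match_ms=30.0)
latency_high_ms = per_frame_latency_ms(K_PAIRS, extract_ms=10.0, match_ms=50.0)
fits_budget_low = latency_low_ms <= AC41_BUDGET_MS
fits_budget_high = latency_high_ms <= AC41_BUDGET_MS

if __name__ == "__main__":
    print("latency range (ms):", latency_low_ms, latency_high_ms)
    print("fits AC-4.1 at low / high end:", fits_budget_low, fits_budget_high)
    print("kept indices:", keep_confident([0.2, 0.97, 1.0, 0.94]))
```

Under these assumptions only the low end of the range meets the budget, which is why the Jetson MVE measurement (D-C3-3) remains the deciding evidence.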

C3 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate (ALIKED+LightGlue addition)

ALIKED+LightGlue — per-numbered binding (C3-relevant lines only; cross-cutting N/A above also apply identically)

Cells share the legend defined under the MixVPR sub-matrix (C2). Where a binding is identical in both substance and evidence to the SP+LightGlue row, the ALIKED+LightGlue row points to that row to avoid restating; where ALIKED+LightGlue's pinned mode produces a materially different binding (BSD-3-Clause-canonical license throughout vs SP+LightGlue's Magic-Leap-restrictive disqualifier on extractor weights, ALIKED-export-absence-in-LightGlue-ONNX HARSHER-D-C3-2-gate vs SP+LightGlue's well-documented TensorRT pathway, drastic-GFLOPs-reduction advantage vs SP+LightGlue's higher GFLOPs, best-Aachen-Day-Night-relocalization-canonical-evidence on ALIKED-N(32)), the ALIKED+LightGlue row carries a distinct evidence cite.

Line Binding Evidence (one-line cite)
AC-1.1 (frame-center within 50 m, ≥80% normal-flight photos) Pass (documentary on Aachen Day-Night Table VII) → Verify (aerial nadir cross-domain) Source #75 paper Table VII documents ALIKED-N(32) Aachen Day-Night (0.25m,2°)=77.6 with mNN matcher + 2048 keypoints — +8.2 absolute lift over SuperPoint=69.4 at strictest tier; transitive lineage with Source #71 LightGlue paper Table 3 NetVLAD top-50 + SP+LightGlue + PnP+RANSAC pipeline at Day(0.25m,2°)=89.2 suggests expected ALIKED+LightGlue accuracy on Aachen Day approaches or exceeds SP+LightGlue's 89.2 at the project's intended pipeline shape. NO direct ALIKED+LightGlue Aachen measurement exists in canonical papers (cvg/LightGlue paper Table 3 was published before the cvg/LightGlue ALIKED port) — Plan-phase community-evaluation cite or Jetson MVE direct measurement required. Aerial nadir cross-domain validation required at Jetson MVE on AerialExtreMatch + Derkachi flight. D-C2-1 reuse: canonical training on MegaDepth + R2D2 homographic is NOT aerial nadir; aerial-domain retrain on AerialVL is moderately retrain-friendly per paper §V sparse NRE loss memory advantage
AC-1.2 (frame-center within 20 m, ≥50% normal-flight photos) Pass (documentary on Aachen Day-Night Table VII) → Verify (aerial nadir cross-domain tighter tail) Same as AC-1.1, tighter tail; paper Table VII documentary evidence ALIKED-N(32) Aachen Day (0.5m,5°)=88.8 with mNN — nominally satisfies AC-1.2 20m bar at 0.5m precision tier (LightGlue would lift further). ALIKED+LightGlue-specific advantage over SP+LightGlue: paper Table IV documents ALIKED-N(16) MHA@3=77.22 vs SuperPoint MHA@3=70.19 = +7.03 absolute on HPatches homography accuracy — ALIKED's deformable descriptor extraction provides better geometric verification accuracy than SuperPoint at the AC-1.2 tail. Aerial nadir AC-1.2 tail validation via AerialExtreMatch Recall@1 stratified by difficulty cell
AC-2.1b (satellite-anchor registration succeeds, AC-1.1/1.2 + AC-2.2 + AC-8.2 + AC-8.6 conditions) Pass (documentary on Aachen) → Verify (aerial nadir cross-domain) C3's contribution is the geometric verification step; paper Table VII documentary evidence ALIKED-N(32) Aachen Day (5m,10°)=100.0 with mNN demonstrates 100% registration success at 5m precision on phototourism (vs SuperPoint=87.8 = +12.2 absolute lift); AC-2.1b registration-success rate is ALIKED+LightGlue's STRONGEST documentary signal. Aerial nadir cross-domain validation required; Jetson MVE measurement on AerialExtreMatch + Derkachi flight
AC-3.3 (≥3 disconnected segments via satellite-reference re-localization) Pass (per-pair stateless) → Verify (recall under perceptual-aliasing + scene-change) ALIKED+LightGlue's per-pair geometric verification is stateless — applies identically to first-flight + re-localization scenarios. ALIKED-N(16rot) sibling mode (D-C3-4 Plan-phase choice) provides best rotation invariance (paper §VI-C1 + Fig. 6 top) — directly applicable to UAV multi-heading re-localization. Cross-season recall under ALIKED's MegaDepth+R2D2-trained weights is unverified on aerial nadir; AerialExtreMatch + D-C2-1 required
AC-4.1 (latency <400 ms p95, end-to-end camera→FC) Verify — TIGHT margin at K=10 pairs/frame on PyTorch-fp16-only path CRITICAL latency-budget interaction: paper Table IV canonical RTX-2060 throughput for ALIKED-N(16) 77.40 FPS @ 640×480 + 1k keypoints = 12.92 ms per pair extraction-only; LightGlue matching with adaptive depth adds ~5-10 ms additional on RTX 2060 (per Source #71 paper §5.4 1.86× speedup on easy pairs); total ~18-23 ms per pair on RTX 2060 PyTorch-fp16. CRITICAL Jetson Orin Nano Super extrapolation: Jetson Orin Nano Super has ~1/4× to 1/6× of RTX 2060 throughput → ~70-140 ms per pair @ 1024 keypoints on PyTorch-fp16-only Jetson standard / ~40-90 ms per pair with adaptive depth. At K=10 top-K retrieval pairs per UAV frame = 700-1400 ms per UAV frame standard / 400-900 ms with adaptivity → AC-4.1 400 ms budget MOSTLY EXCEEDED on PyTorch-fp16-only path. HARSHER D-C3-2 Jetson export-pathway gate: Source #73 LightGlue-ONNX does NOT ship documented ALIKED end-to-end ONNX/TensorRT pathway as of January 2026 — ALIKED's torchvision.ops.deform_conv2d is a known-difficult ONNX export op; project's Jetson runtime path for ALIKED+LightGlue is restricted to (a) PyTorch-fp16 only, (b) custom ONNX export with deform_conv plugin (significant engineering effort), (c) Torch-TensorRT partial graph compilation with deform_conv falling back to PyTorch-eager (mixed runtime). D-C3-2 NEW Plan-phase choice for ALIKED+LightGlue: option (a) PyTorch-fp16 only is the DOMINANT runtime path vs DISK+LightGlue's well-documented TensorRT acceleration. D-C3-3 NEW Plan-phase choice (K-pairs-per-frame budget): more critical for ALIKED+LightGlue than for SP+LightGlue or DISK+LightGlue due to PyTorch-fp16-only restriction; likely requires K reduction from 10 to 3-5 OR ALIKED-T(16) 64-D sibling mode (1.37 GFLOPs / 125.87 FPS RTX 2060) for AC-4.1 satisfaction. 
D-C3-4 NEW Plan-phase choice (ALIKED-sibling-mode): ALIKED-T(16) prioritizes Jetson PyTorch-fp16-only latency at the cost of 64-D descriptor accuracy reduction; ALIKED-N(16) is the canonical baseline; ALIKED-N(16rot) prioritizes UAV multi-heading rotation invariance; ALIKED-N(32) prioritizes Aachen-Day-Night-best documentary lift at the cost of ~10% higher Jetson latency vs N(16)
AC-4.2 (memory <8 GB shared) Pass (with Verify) — comparable model footprint to SP+LightGlue ALIKED-N(16) 0.677M params + LightGlue 12M params at canonical 9-layer config = ~12.7M params ≈ ~26 MB total weights at fp16 (comparable to SP+LightGlue's ~27 MB and to DISK+LightGlue's ~13.1M params ≈ ~26 MB). Activations at 1024×1024 RGB batch=1 ~50-150 MB at fp16 (ALIKED dense feature map at multi-scale + SDDH per-keypoint sampling overhead + LightGlue self-attention + cross-attention layers ~30-80 MB per layer at 1024 keypoints). No descriptor-cache pressure at C3 (vs C2 single-stage which has descriptor cache); C3 cache footprint is exactly 0 GB of the 10 GB AC-8.3 cache budget (same as SP+LightGlue + DISK+LightGlue). Co-resident memory pressure with C1/C2/C4/C5/C6 manageable — Jetson MVE measurement
AC-8.1 (cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px) Pass (with Verify) — resolution-agnostic at API level ALIKED is resolution-agnostic at the algorithm level (deformable conv accepts any input size; canonical paper evaluates at 640×480 + 800×800 + Aachen native resolutions); cross-resolution matching at 0.5 m/px tile GSD vs nav-camera 12 cm/px GSD at 1 km AGL unverified — AerialExtreMatch cross-scale cells are the documentary target
AC-8.6 — Scale-ratio (any UAV-frame ground footprint at deployment altitude must be retrievable) Verify — best documentary scale invariance among single-scale C3 candidates At 1 km AGL the nav-camera frame footprint is 470×314 m to 980×655 m; ALIKED's canonical 1024-largest-edge RGB input is the same as SP+LightGlue + DISK+LightGlue. ALIKED+LightGlue-specific advantage: paper §VI-C2 + Fig. 6 bottom documents ALIKED-N(16) has best matching accuracy among single-scale matching methods at all scale-difference levels; multi-scale variant ALIKED-N(16, MS) handles up to 8× scale difference (vs R2D2(MS) which degrades at 4×). For aerial nadir UAV frames vs satellite tiles where scale variation is bounded by AGL altitude × satellite tile GSD ratio (~4× at 1 km AGL × 0.5 m/px), ALIKED's scale-invariance advantage is materially relevant
AC-8.6 — Scene change in active-conflict sectors Verify — partial geometric robustness via deformable descriptor Cratering / building destruction / road realignment is exactly the AerialExtreMatch "scene-change" cell. ALIKED+LightGlue-specific structural advantage over SP+LightGlue: paper Eqs. 4-5 deformable descriptor extraction at sparse keypoints provides per-keypoint geometric-invariance modeling that adapts to local scene structure changes; paper Fig. 7 visualizes deformable focus areas adapting to homography + perspective image pairs. ALIKED+LightGlue's structural defense against scene-change is theoretically stronger than SP+LightGlue's fixed-grid descriptor extraction, but unverified on aerial-conflict scene-change. AerialExtreMatch + D-C2-1 retrain decision required
AC-8.6 — Compute & latency under steady-state and re-loc-trigger Verify — TIGHT margin under steady-state on PyTorch-fp16-only path ALIKED+LightGlue's per-pair compute is variable (LightGlue adaptive-depth + adaptive-width pruning) — same advantage as SP+LightGlue + DISK+LightGlue (paper §5.4 1.86× speedup on easy pairs / 1.16× on hard / 1.45× average). HOWEVER, ALIKED+LightGlue lacks the ONNX/TensorRT acceleration multiplier — Jetson PyTorch-fp16-only path is ~2-3× slower than DISK+LightGlue's TensorRT EP path. Steady-state UAV operation has many high-overlap pairs (consecutive UAV frames overlap at 1 km AGL with low altitude-variability) → the adaptive-depth advantage partially offsets the PyTorch-fp16-only restriction. Re-loc-trigger workload after AC-3.3 disconnection has more cross-season + cross-time hard pairs → adaptive-depth advantage is reduced; combined with PyTorch-fp16-only restriction → TIGHTER D-C3-3 K-pairs-per-frame budget gate for ALIKED+LightGlue than for DISK+LightGlue or SP+LightGlue
AC-NEW-2 (spoofing-promotion latency <3 s p95) Pass (single-pair latency comfortable) → Verify (multi-pair re-anchor latency on PyTorch-fp16-only path) Single-pair latency budget very comfortable — ALIKED+LightGlue per-pair at PyTorch-fp16 (~70-140 ms standard / 40-90 ms adaptive on Jetson Orin Nano Super extrapolation) << 3 s budget (~20-75× under). Multi-pair re-anchor latency at K=10 pairs: 700-1400 ms standard / 400-900 ms adaptive — comfortably within 3 s budget. ALIKED+LightGlue-specific consideration: paper Table VII Aachen Day-Night documentary evidence at strictest tier (0.25m,2°)=77.6 (ALIKED-N(32)+mNN; LightGlue would lift) demonstrates strong re-anchor reliability if C2 retrieval delivers high-recall top-K; deformable descriptor robustness to viewpoint variation aligns with UAV-may-have-flown-different-heading-by-spoofing-detection-time scenario
AC-NEW-6 (imagery freshness — never satellite_anchored on stale-tile match) Pass (mechanical) ALIKED+LightGlue produces 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair; freshness-age decision is a downstream C5/C6 filter on the (tile-id, match-success, inlier-count) tuple. No structural interaction with freshness — same as SP+LightGlue + DISK+LightGlue rows
AC-NEW-7 (cache-poisoning safety budget — P(>30 m geo-misalign) <1%, P(>100 m) <0.1%) Pass — STRUCTURAL geometric-verification advantage over C2 single-stage retrieval Same as SP+LightGlue row — C3's per-correspondence confidence threshold τ=0.1 + soft partial assignment matrix + downstream C4 PnP+RANSAC inlier selection provides the structural geometric-verification layer that catches mid-flight-written misaligned tiles (AC-8.4); rejects poisoned-but-misaligned tiles via low-inlier-count or high-residual-error at the RANSAC step. ALIKED+LightGlue-specific advantage: paper Table IV MHA@3=77.22% vs SP MHA@3=70.19% = +7.03 absolute on homography accuracy → stronger structural cache-poisoning defense via better geometric verification
Restriction "Operational area: eastern/southern Ukraine" — sparse-matcher train-domain match ⚠️ Documentary gap → Verify (D-C2-1 reuse + MODERATE retrain-friendliness) Canonical ALIKED+LightGlue weights are pre-trained on MegaDepth perspective dataset (135 scenes, 1.35M image pairs sampled per DISK methodology) + R2D2 homographic dataset (Oxford-Paris + Aachen synthetic homographies) — same caveat as SP+LightGlue + DISK+LightGlue + C2 candidates; D-C2-1 retrain decision applies to ALIKED+LightGlue identically. ALIKED+LightGlue-specific consideration: MODERATE retrain-friendliness — paper §V sparse NRE loss relaxation reduces GPU memory by ~3.5× vs DISK's RL training; canonical training takes 100K steps over MegaDepth + R2D2 homographic at 800×800 batch 2 with gradient accumulation × 6 — feasible on a single RTX 3090 in ~24 hours (similar cost profile to EigenPlaces's <7 GB VRAM advantage in C2). Paper §VI-C1 documents ALIKED-N(16rot) rotation-augmented variant — directly aligned with UAV multi-heading aerial flights generating multi-rotation training signal. AerialExtreMatch + Derkachi flight required
Restriction "Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural)" — sparse-matcher scale band match Verify Same as AC-8.6 scale-ratio row; cross-scale matching at the project's altitude band is the AerialExtreMatch cross-scale cell; ALIKED's best-among-single-scale-methods scale invariance (paper §VI-C2) is materially relevant
Restriction "Weather: predominantly sunny ... seasonal/visibility classes" — sparse-matcher cross-season generalization Verify (DOCUMENTARY EVIDENCE on Aachen Day-Night extreme illumination from paper Table VII) Cross-season matching is the dominant aerial-cross-domain failure mode; canonical ALIKED+LightGlue weights are MegaDepth-perspective + R2D2-homographic-trained — D-C2-1 is the primary lever. ALIKED+LightGlue-specific finding: paper Table VII Aachen Day-Night documentary evidence ALIKED-N(32) at (0.25m,2°)/(0.5m,5°)/(5m,10°)=77.6/88.8/100.0 with 2048 keypoints — STRONG cross-illumination geometric robustness (Aachen Day-Night exercises extreme day/night illumination on outdoor visual localization, equivalent in spirit to project's cross-season + cross-conflict aerial conditions). Aerial nadir cross-season + cross-conflict validation unverified — D-C2-1 retrain decision + AerialExtreMatch + Derkachi flight required
Restriction "Navigation camera (pinned): ADTi 20MP, 5472×3648" Pass (API) — same downscale as canonical ALIKED+LightGlue consumes any 1024-largest-edge RGB input; the 5472×3648 → 1024×683 downscale is same aggressiveness as SP+LightGlue + DISK+LightGlue. D-C2-3 input-resolution-shape Plan-phase decision applies identically. Algorithm is resolution-agnostic at API level — preprocess_conf={"resize": 1024} is exposed in canonical ALIKED extractor; project may choose 1280 or 1536 at Jetson MVE time at proportional latency cost (1280 = ~1.6× compute; 1536 = ~2.25× compute)
Restriction "Satellite Imagery — resolution ≥0.5 m/px" — sparse-matcher pipeline at AC-8.1 floor Verify Same as AC-8.1
Restriction "Satellite Imagery — Cache budget: 10 GB" — sparse-matcher cache footprint Pass — NO C3 cache footprint C3 cache footprint is exactly 0 GB — same as SP+LightGlue + DISK+LightGlue; ALIKED+LightGlue operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time state. Model weights ~26 MB at fp16 = 0.26% of cache budget loaded once at boot
Restriction "Companion computer: Jetson Orin Nano Super, 8 GB shared" Verify — HARSHER D-C3-2 Jetson export-pathway gate vs DISK+LightGlue / SP+LightGlue; LOW MODEL FOOTPRINT advantage CRITICAL D-C3-2 finding for ALIKED+LightGlue: Source #73 LightGlue-ONNX does NOT ship documented ALIKED end-to-end ONNX/TensorRT pipeline as of January 2026 — Source #73 README changelog lists SuperPoint (28 Jun 2023) + DISK (30 Jun 2023) extractor support only; CLI examples use superpoint positional only; citations cite LightGlue + SuperPoint + DISK papers only. Plus the canonical lightglue/aliked.py uses torchvision.ops.deform_conv2d (BSD-3-Clause inherited from Shiaoming/ALIKED canonical) which is a known-difficult ONNX export op (deformable conv historically requires either ONNX opset ≥19 native DeformConv OR custom TensorRT plugin). Implication for D-C3-2: ALIKED+LightGlue's Jetson runtime path is restricted to (a) PyTorch-fp16 only (DOMINANT path; likely 2-3× slower than DISK+LightGlue's TensorRT pathway); (b) custom ONNX export with deform_conv plugin (significant engineering effort); (c) Torch-TensorRT partial graph compilation with deform_conv falling back to PyTorch-eager (operationally complex on Jetson). Steady-state co-resident memory + GPU-time with C1 + C2 + C4 + C5 + C6 manageable — model footprint advantage compounds (~26 MB at fp16 = lowest C-row component) but PyTorch-fp16-only restriction is the binding constraint
Restriction "License posture (D-C1-1)" — sparse-matcher license-track interaction POSITIVE finding (BSD-3-Clause canonical + Apache-2.0 matcher = clean BSD/permissive throughout) — D-C3-1 SECONDARY-MITIGATION role POSITIVE on canonical Shiaoming/ALIKED: Source #74 LICENSE explicit copyright statement = BSD-3-Clause (Copyright (c) 2022, Zhao Xiaoming) — permissive, BSD/permissive license track. POSITIVE on cvg/LightGlue matcher: Source #70 LICENSE = Apache-2.0 (Copyright 2023 ETH Zurich) — permissive, BSD/permissive license track. CLEAN BSD/permissive license track THROUGHOUT — no Magic Leap noncommercial-research disqualifier (vs SP+LightGlue), no GPL-3.0 copyleft (vs SALAD on C2 row). Under D-C1-1 = (a) GPL-3.0 track, (b) BSD/permissive lock, or (c) keep-both-tracks-open, ALIKED+LightGlue is eligible on every license-posture choice. D-C3-1 SECONDARY-MITIGATION role — ALIKED+LightGlue is the second-cleanest license-compliant alternative to SP+LightGlue's Magic-Leap-restrictive disqualifier, after DISK+LightGlue (D-C3-1 RECOMMENDED-PRIMARY). However, the Jetson export-pathway gap (no LightGlue-ONNX ALIKED support as of January 2026) is the structural disadvantage vs DISK+LightGlue. Recommendation: present D-C1-1 + D-C3-1 + this row to user as a structured Choose block at Plan time; DISK+LightGlue is the cleanest license-compliant + technically-superior + Jetson-deployment-ready C3 choice, ALIKED+LightGlue is the second-cleanest license-compliant choice with the trade-off of PyTorch-fp16-only Jetson runtime
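
The latency extrapolation in the AC-4.1 row above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it reuses the table's own figures (18-23 ms per pair on RTX 2060, a 4-6× Jetson slowdown, the 40-90 ms adaptive-depth range, the 400 ms budget), and the class and function names are hypothetical. None of these values are Jetson measurements.

```python
"""Sketch of the AC-4.1 per-frame latency arithmetic (documentary extrapolation only)."""
from dataclasses import dataclass
from typing import Tuple

AC_4_1_BUDGET_MS = 400.0  # AC-4.1 end-to-end budget


@dataclass(frozen=True)
class PairLatency:
    """Optimistic and pessimistic per-pair latency in milliseconds."""
    low_ms: float
    high_ms: float


def extrapolate(desktop: PairLatency, slowdown_low: float, slowdown_high: float) -> PairLatency:
    """Scale a desktop-GPU range by an assumed slowdown range for the target device."""
    return PairLatency(desktop.low_ms * slowdown_low, desktop.high_ms * slowdown_high)


def frame_latency(pair: PairLatency, k_pairs: int) -> PairLatency:
    """Per-frame latency when K retrieved tiles are matched sequentially."""
    return PairLatency(pair.low_ms * k_pairs, pair.high_ms * k_pairs)


def max_pairs_within(pair: PairLatency, budget_ms: float) -> Tuple[int, int]:
    """Largest K that fits the budget in the (optimistic, pessimistic) case."""
    return int(budget_ms // pair.low_ms), int(budget_ms // pair.high_ms)


if __name__ == "__main__":
    rtx2060 = PairLatency(18.0, 23.0)            # extraction + matching, from the row above
    jetson = extrapolate(rtx2060, 4.0, 6.0)      # 72-138 ms, rounded to ~70-140 ms in the row
    adaptive = PairLatency(40.0, 90.0)           # adaptive-depth range quoted in the row
    print(frame_latency(jetson, 10))             # per-frame range at K=10, standard
    print(max_pairs_within(jetson, AC_4_1_BUDGET_MS))
    print(max_pairs_within(adaptive, AC_4_1_BUDGET_MS))
```

Under these assumptions the standard path fits at most 2 to 5 pairs per frame, which is consistent with the row's suggestion to reduce K from 10 to 3-5. The sketch ignores every other stage of the camera→FC path, so real headroom is smaller.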

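The nav-camera downscale and the quoted compute multipliers for larger inputs follow from simple geometry. A minimal sketch, assuming an aspect-preserving resize on the longest edge and compute that scales with pixel count (the assumption implied by the ~1.6× and ~2.25× figures); the function names are hypothetical.

```python
"""Sketch of the longest-edge resize and relative compute cost (illustrative only)."""
from typing import Tuple


def resize_longest_edge(width: int, height: int, target: int) -> Tuple[int, int]:
    """Return the aspect-preserving size whose longest edge equals `target`."""
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)


def relative_compute(target: int, baseline: int = 1024) -> float:
    """Compute cost relative to the baseline edge, assuming cost scales with pixel count."""
    return (target / baseline) ** 2


if __name__ == "__main__":
    # ADTi 20MP nav frame, 5472x3648, at the three candidate input sizes.
    for edge in (1024, 1280, 1536):
        print(edge, resize_longest_edge(5472, 3648, edge), relative_compute(edge))
```

The quadratic model is a first-order estimate; LightGlue's matching cost depends on keypoint count rather than pixel count, so the measured multiplier at Jetson MVE time may differ.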
  • Statement: DISK+LightGlue (cvlab-epfl/disk NeurIPS 2020; canonical implementation by Michał J. Tyszkiewicz + Pascal Fua + Eduard Trulls, EPFL CVLab + Google Zurich; cvg/LightGlue port lightglue/disk.py Apache-2.0-inherited from canonical via kornia integration, replaces canonical detect.py + match.py H5-based pipeline with kornia.feature.DISK.from_pretrained("depth") direct PyTorch instantiation per Source #76 + Source #70) is the modern competitive RL-trained sparse-extractor + matcher design point on the Apache-2.0 license track for the C3 row — combining REINFORCE-class policy gradient end-to-end training of detection + description with depth-based reward (paper §4 + Source #77) with cvg/LightGlue's adaptive-depth/adaptive-width sparse-matcher transformer. Per the per-Mode API Capability Verification rule, the project's pinned mode is the (DISK extractor with weights="depth" at 1024-largest-edge RGB input → up to 1024 keypoints with 128-D L2-normalised descriptors + per-keypoint detection scores; canonical 4-layer U-Net architecture with deformable convolutions in bottleneck; image dimensions auto-padded to multiple of 16 via pad_if_not_divisible=True; NMS window size 5, detection threshold 0.0) + (LightGlue matcher with features='disk', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True auto-detected, mp=False) → up to 1024 2D-2D correspondences with confidence scores feeding the project's downstream C4 PnP+RANSAC pose estimator. The canonical inference pipeline is identical to SP+LightGlue + ALIKED+LightGlue: extractor.extract(image_query) → extractor.extract(image_target) → matcher({'image0': feats_q, 'image1': feats_t}) → rbd() → points0 = feats0['keypoints'][matches[..., 0]] and points1 = feats1['keypoints'][matches[..., 1]]. 
Two separately-cataloged DISK pretrained-weights sibling modes documented in canonical Source #76 + cvg/LightGlue's lightglue/disk.py: DISK-depth (canonical default; trained with depth-based RL reward; reproduces paper Table 1 best results 0.51315 stereo AUC + 0.72705 multiview AUC on IMW2020 test set with 2k features); DISK-epipolar (alternate; trained with epipolar reward; supplementary material variant per canonical paper §6.2). Mode-enumeration query (1/3) — context7 NOT INDEXED + WebFetch fallback PASS: context7 resolve-library-id returned no relevant matches for "DISK" feature extractor (top-results were Disk Inventory X / Expo Build Disk Cache / Blacksmith Sticky Disk / disko NixOS / gptman — all unrelated to feature-matching); per Per-Mode API Capability Verification rule item 2, fall-back to official-docs WebFetch on the canonical cvlab-epfl/disk README + GitHub API license metadata was used (Source #76) plus canonical paper WebFetch (Source #77) plus cvg/LightGlue lightglue/disk.py source-code inspection (transitively via Source #70). 
Pinned-mode runnable example query (2/3) — WebFetch PASS: Source #76 (canonical cvlab-epfl/disk README) ships canonical inference CLI demos python detect.py --height 1024 --width 1024 --n 2048 h5_artifacts_destination images_directory + python match.py --rt 0.95 --save-threshold 100 h5_artifacts_destination; Source #70 (cvg/LightGlue canonical README) ships the canonical pipeline with a one-line swap to use DISK extractor: from lightglue import LightGlue, DISK; from lightglue.utils import load_image, rbd; extractor = DISK(max_num_keypoints=1024).eval().cuda(); matcher = LightGlue(features='disk').eval().cuda(); image0 = load_image('uav_frame.jpg').cuda(); image1 = load_image('satellite_tile.jpg').cuda(); feats0 = extractor.extract(image0); feats1 = extractor.extract(image1); matches01 = matcher({'image0': feats0, 'image1': feats1}); feats0, feats1, matches01 = [rbd(x) for x in [feats0, feats1, matches01]]; matches = matches01['matches']; points0 = feats0['keypoints'][matches[..., 0]]; points1 = feats1['keypoints'][matches[..., 1]]. Source #71 (cvg/LightGlue paper Appendix A Table 6) documents DISK+LightGlue stereo AUC@5° on IMC 2020 = 67.02 vs SP+LightGlue 59.03 = +7.99 absolute documentary technical superiority + DISK+LightGlue stereo AUC@10° on IMC 2020 = 83.45 vs SP+LightGlue 77.96 = +5.49 absolute — strongest documentary technical-superiority signal vs canonical SP+LightGlue across the project's evaluated C3 candidates. Source #73 (fabio-sim/LightGlue-ONNX) documents DISK end-to-end ONNX export pathway in 30 Jun 2023 changelog entry: "DISK feature extraction support added"; CLI commands parallel SP+LightGlue export (lightglue-onnx export disk_lightglue --num-keypoints 1024 -b 2 -h 1024 -w 1024 --fp16 --device cuda). 
Disqualifier-probe query (3/3): did NOT surface any documented frame-rate floor (single-pair single-pass inference, parameter-free per-pair besides the model itself); DID surface a documented HIGHEST-RAW-COMPUTE-COST among modern competitive sparse extractors — Source #75 ALIKED paper Table III documents DISK at 98.97 GFLOPs at 640×480 + 1k keypoints / 11.81 FPS RTX 2060 / 1.092M params = 24.4× higher GFLOPs than ALIKED-N(16) (4.05 GFLOPs) + 3.8× higher GFLOPs than SuperPoint (26.11 GFLOPs); did NOT surface any documented memory ceiling at the algorithm level beyond DISK+LightGlue's footprint (DISK 1.092M params + LightGlue 12M params at canonical 9-layer config = ~13.1M params ≈ ~26 MB at fp16 total weights — comparable to SP+LightGlue's ~27 MB and ALIKED+LightGlue's ~26 MB); did NOT surface any Jetson Orin Nano measurement directly (similarly to all C-row components — D-C3-3 deferred Jetson MVE phase will resolve); DID surface HIGHER raw-compute-cost than SP+LightGlue / ALIKED+LightGlue at K=10 pairs/frame: TensorRT-equipped Jetson Orin Nano Super extrapolation ~50-100 ms per pair @ 1024 keypoints fp16 + LightGlue-ONNX TensorRT EP / ~200-400 ms PyTorch-fp16-only fallback; at K=10 retrieval pairs/frame this puts AC-4.1 400 ms budget at MEDIUM-RISK margin (better than ALIKED+LightGlue's PyTorch-fp16-only HARSH-RISK margin since LightGlue-ONNX TensorRT path is available, but worse than SP+LightGlue's TIGHT margin due to DISK's higher raw GFLOPs); DID surface HIGH retrain-cost — canonical RL-policy-gradient training takes ~2 weeks on 32 GB V100s OR ~2 weeks on 12 GB GPUs with low-memory variant (python train.py --substep 2 --batch-size 1 --chunk-size 10000 --warmup 500), vs ALIKED's ~24 hours on RTX 3090 (paper §V sparse NRE loss reduces GPU memory 3.5× vs DISK's RL training). 
Three POSITIVE structural advantages over canonical SP+LightGlue (Magic-Leap-restrictive HARD-DISQUALIFIER): (i) Apache-2.0 license-track placement THROUGHOUT (Source #76 GitHub API metadata license.spdx_id: "Apache-2.0" on canonical extractor + Source #70 cvg/LightGlue Apache-2.0 on matcher + kornia Apache-2.0 on integration layer = fully clean Apache-2.0 license track on every layer of the DISK+LightGlue stack; CLEANEST license-compliant LightGlue-extractor-sibling vs ALIKED+LightGlue's BSD-3-Clause + Apache-2.0 mixed track and SP+LightGlue's Magic-Leap-restrictive HARD-DISQUALIFIER; eligible on every D-C1-1 license-posture path); (ii) Paper Appendix A Table 6 +7.99 absolute AUC@5° on IMC 2020 stereo over canonical SP+LightGlue + +5.49 absolute AUC@10° = demonstrably technically superior to canonical SP+LightGlue on phototourism stereo per Source #71 paper documentation; (iii) LightGlue-ONNX TensorRT export pathway PRESENT (Source #73 30 Jun 2023 changelog entry "DISK feature extraction support added" + parallel CLI commands to SP+LightGlue) — DISK+LightGlue is the second-cleanest LightGlue-extractor-sibling for Jetson deployment after SP+LightGlue (which has the most-mature ONNX/TensorRT pathway via 28 Jun 2023 changelog) but before ALIKED+LightGlue (export-absent in LightGlue-ONNX as of January 2026). 
Three POSITIVE structural advantages over ALIKED+LightGlue (D-C3-1 SECONDARY-MITIGATION): (iv) Jetson-deployment-ready via LightGlue-ONNX TensorRT pathway (DISK has documented Jun 2023 changelog support; ALIKED has NO LightGlue-ONNX support as of January 2026; DISK can leverage TensorRT acceleration ~3-5× speedup over PyTorch fp16); (v) Higher #matches per pair (paper Table V Stereo NM=2048 for DISK vs 1934.2 for ALIKED-N(16) = +5.9% more matches — critical for reducing C4 PnP+RANSAC failure rate when high-overlap UAV-vs-cached-tile pairs require high inlier counts); (vi) Higher MMA@3 on HPatches (paper Table III: DISK 77.59 vs ALIKED-N(16) 74.43 = +3.16 absolute — slightly better per-pixel matching accuracy). Three NEGATIVE structural findings vs ALIKED+LightGlue: (vii) Higher raw GFLOPs at competitive accuracy — DISK 98.97 GFLOPs vs ALIKED-N(16) 4.05 GFLOPs = 24.4× higher GFLOPs (LightGlue-ONNX TensorRT pathway partially mitigates but does not eliminate this gap); (viii) Lower MHA@3 on HPatches homography accuracy (paper Table III: DISK 70.56 vs ALIKED-N(16) 77.22 = -6.66 absolute — DISK's evenly-distributed dense keypoints give weaker geometric verification accuracy); (ix) Worse Aachen Day-Night relocalization (paper Table VII at 2048 keypoints / 0.25m,2°: DISK 70.4 vs ALIKED-N(32) 77.6 = -7.2 absolute; DISK is stronger on phototourism stereo but ALIKED is stronger on visual-localization; the project's intended pipeline is closer to visual-localization than to phototourism stereo — UAV-vs-satellite-tile registration with cross-season + cross-conflict imagery is structurally more like Aachen Day-Night than IMC 2020 stereo). 
Pinned-mode sentence: "We will use DISK+LightGlue with DISK extractor with weights='depth' (canonical depth-based-RL-reward checkpoint) at 1024-largest-edge RGB input + up to 1024 keypoints with 128-D L2-normalised descriptors + LightGlue matcher with features='disk', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True at 1024×1024 RGB input per image (auto-padded to multiple of 16 via pad_if_not_divisible=True; auto-converted from grayscale via kornia.color.grayscale_to_rgb) (canonical cvg/LightGlue DISK port + canonical cvlab-epfl/disk save-depth.pth pretrained weights distributed via kornia model registry), with inputs {1× ADTi 20MP nav frame stream → bilinearly downscaled-to-largest-edge 1024 + 1× cached satellite tile per top-K retrieval result from C2} and expect outputs {up to 1024 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair, feeding C4 PnP+RANSAC with cosine confidence threshold filter at 0.95 × per-pair-max-score} on Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble; **PyTorch fp16 baseline as fallback runtime + LightGlue-ONNX + TensorRT EP as DOMINANT runtime path** via Source #73 30 Jun 2023 changelog DISK end-to-end ONNX export support; alternatively pure TensorRT via lightglue-onnx trtexec Polygraphy-based pathway; FP8 ModelOpt path with Jetson Ampere FP8 emulation verification gate at MVE phase per D-C3-2). 
D-C3-1 RECOMMENDED-PRIMARY-MITIGATION role per engine Component Option Breadth rule — DISK+LightGlue is the cleanest license-compliant + technically-superior + Jetson-deployment-ready C3 choice vs canonical SP+LightGlue's Magic-Leap-restrictive HARD-DISQUALIFIER and ALIKED+LightGlue's PyTorch-fp16-only Jetson runtime restriction; the Apache-2.0-throughout placement + paper Table 6 +7.99-absolute-AUC@5° superiority + LightGlue-ONNX TensorRT-pathway-present compounds into the strongest documentary case for D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock at Plan-phase decision."
  • Source: Source #76 (cvlab-epfl/disk canonical README + GitHub API license metadata — Apache-2.0; two pretrained checkpoints save-depth.pth + save-epipolar.pth; canonical inference CLIs python detect.py --height 1024 --width 1024 --n 2048 + python match.py --rt 0.95 --save-threshold 100; cvg/LightGlue lightglue/disk.py Apache-2.0-inherited via kornia integration with kornia.feature.DISK.from_pretrained("depth"); LightGlue-ONNX DISK-export-PRESENT finding from Source #73 30 Jun 2023 changelog), Source #77 (canonical paper arXiv:2006.13566 / Tyszkiewicz et al. NeurIPS 2020 — §3 architecture [4-layer U-Net + deformable bottleneck + per-pixel dense descriptor head + per-pixel scoring head] + §4 method [REINFORCE-class policy gradient + depth-based reward + inverse_T = θ_M matching temperature scheduling annealed 15→50 over 20 epochs] + §5 experiments [HPatches MMA@3 Figure 5 + IMW2020 stereo + multiview AUC Table 1 best single-extractor result at 2020 publication time; canonical schedule produces 0.51315 stereo AUC + 0.72705 multiview AUC at 2k features] + §6 limitations [computationally expensive RL training ~2 weeks on 32 GB V100; ~2 weeks at smaller batch on 12 GB low-memory variant]), Source #71 (cvg/LightGlue canonical paper Appendix A Table 6 cross-cite — DISK+LightGlue stereo AUC@5° on IMC 2020 = 67.02 vs SP+LightGlue 59.03 = +7.99 absolute + DISK+LightGlue stereo AUC@10° on IMC 2020 = 83.45 vs SP+LightGlue 77.96 = +5.49 absolute = strongest documentary technical-superiority signal for D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock), Source #70 (cvg/LightGlue canonical README cross-cite — LightGlue(features='disk') mode wiring with input_dim=128; from lightglue import DISK extractor class import; transitive citation for the cvg/LightGlue port file lightglue/disk.py), Source #73 (fabio-sim/LightGlue-ONNX companion cross-cite — DISK end-to-end ONNX export pathway PRESENT: 30 Jun 2023 changelog "DISK feature extraction support added"; CLI commands parallel 
SP+LightGlue export with lightglue-onnx export disk_lightglue --num-keypoints 1024 -b 2 -h 1024 -w 1024 --fp16 --device cuda and inference via lightglue-onnx infer disk_lightglue --image image1.jpg --image image2.jpg -d tensorrt --fp16), Source #75 ALIKED paper cross-cite — Table III documents DISK at 1.092M params / 98.97 GFLOPs / 11.81 FPS RTX 2060 / MMA@3=77.59% / MHA@3=70.56% at 640×480 + 1k keypoints; Table V documents DISK Stereo NM=2048 / mAA(5°)=44.80 / mAA(10°)=85.20 + Multiview NL=2424.8 / mAA(5°)=38.72 / mAA(10°)=51.22 / TL=5.50 with PPC_stereo=0.52 (24.8× lower than ALIKED-N(16)'s 12.91); Table VII documents DISK Aachen Day-Night at 2048 keypoints / mNN matcher = 70.4/82.7/94.9 at (0.25m,2°)/(0.5m,5°)/(5m,10°) — beats SuperPoint at strictest tier by +1.0 absolute but loses to ALIKED-N(32) by -7.2 absolute
  • Phase: Phase 2
  • Target Audience: System architects + C3 implementer + C4 (PnP+RANSAC) implementer + C7 (Jetson runtime) implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 + D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock) + Jetson-deployment decision-maker (D-C3-2 with PREFERRED ONNX Runtime + TensorRT EP path for DISK+LightGlue)
  • Confidence: for mode-enumeration (two canonical pretrained-weights sibling modes save-depth.pth + save-epipolar.pth + LightGlue matcher integration via features='disk'), runnable-example (canonical cvlab-epfl/disk demo CLIs + cvg/LightGlue port one-liner via kornia integration), parameter-count (DISK 1.092M params + LightGlue 12M params = ~13.1M total ≈ ~26 MB at fp16), license (Apache-2.0 confirmed via GitHub API metadata license.spdx_id: "Apache-2.0" on canonical extractor + Apache-2.0 on cvg/LightGlue matcher + Apache-2.0 on kornia integration layer = fully clean Apache-2.0 license track throughout the entire stack); for documentary RTX-2060 throughput baseline (DISK 11.81 FPS @ 640×480 + 1k keypoints per ALIKED paper Table III), HPatches MMA@3=77.59% / MHA@3=70.56% at 1k keypoints, IMW2020 stereo + multiview AUC documentary (canonical paper Table 1 + ALIKED paper Table V cross-cite), Aachen Day-Night documentary at 2048 keypoints + mNN matcher (per ALIKED paper Table VII cross-cite); for paper Appendix A Table 6 documentary technical superiority over canonical SP+LightGlue on IMC 2020 stereo (DISK+LightGlue AUC@5°=67.02 vs SP+LightGlue 59.03 = +7.99 absolute + AUC@10°=83.45 vs SP+LightGlue 77.96 = +5.49 absolute — strongest documentary signal for D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock); for LightGlue-ONNX DISK export pathway PRESENT (Source #73 30 Jun 2023 changelog + parallel CLI commands to SP+LightGlue + 11 Jul 2023 mixed-precision + 19 Jul 2023 TensorRT support); ⚠️ for Jetson Orin Nano Super deployment latency / memory / accuracy (no documentary measurement — Jetson MVE will resolve via D-C3-3); ⚠️ for DISK 98.97 GFLOPs HIGHEST among modern competitive sparse extractors — extrapolated Jetson Orin Nano Super latency at K=10 pairs ~50-100 ms per pair fp16 + LightGlue-ONNX TensorRT EP standard / ~200-400 ms PyTorch-fp16-only fallback (HIGHER than SP+LightGlue's 30-60 ms standard / HIGHER than ALIKED+LightGlue's PyTorch-fp16-only 70-140 ms); ⚠️ for 
DISK RL-policy-gradient training cost (~2 weeks on 32 GB V100 OR ~2 weeks at smaller batch on 12 GB low-memory variant; vs ALIKED's ~24 hours on RTX 3090 = DISK is less retrain-friendly than ALIKED at the GPU-memory level for D-C2-1 = (a) project-domain retrain decision; vs SP-reproduction which would require Magic-Leap's Homographic Adaptation training pipeline + LICENSE clearance = DISK is more retrain-friendly than SP-reproduction); for canonical-checkpoint aerial-domain fitness (canonical training on EPFL CVLab DISK dataset ~164 GB sampled from MegaDepth phototourism scenes with depth-map supervision — NOT aerial nadir; same caveat as SP+LightGlue + ALIKED+LightGlue + C2 candidates, D-C2-1 reuse); for clean Apache-2.0 license track throughout (eligible on every D-C1-1 license-posture path; no Magic Leap noncommercial-research disqualifier applies; no GPL-3.0 copyleft applies); for COLMAP integration (colmap/colmap2dataset.py) directly applicable to D-C2-1 = (a) project-side aerial-domain retrain workflow on AerialVL + Derkachi-flight scenes
  • Related Dimension: SQ3+SQ4 / C3 modern competitive RL-policy-gradient sparse-extractor + matcher candidate (D-C3-1 RECOMMENDED-PRIMARY-MITIGATION role) — per-mode API capability verification gate
  • Fit Impact: DOCUMENTARY PASS for the per-mode API capability verification gate — DISK+LightGlue has a documented runnable per-mode example with the project's pinned configuration (canonical cvlab-epfl/disk + cvg/LightGlue DISK port via kornia integration + canonical paper algorithmic specification), two documented DISK pretrained-weights sibling modes (DISK-depth canonical default + DISK-epipolar alternate), and no API-level disqualifier. Three POSITIVE structural findings vs all prior C-row components: (i) FULLY CLEAN APACHE-2.0 LICENSE TRACK THROUGHOUT — Apache-2.0 on canonical cvlab-epfl/disk extractor + Apache-2.0 on cvg/LightGlue matcher + Apache-2.0 on kornia integration layer = CLEANEST license-compliant LightGlue-extractor-sibling in the project's evaluated C3 candidate space, and the strongest documentary case for D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock vs SP+LightGlue's Magic-Leap-restrictive HARD-DISQUALIFIER. (ii) PAPER APPENDIX A TABLE 6 +7.99 ABSOLUTE AUC@5° ON IMC 2020 STEREO OVER CANONICAL SP+LIGHTGLUE — DISK+LightGlue is the demonstrably technically-superior LightGlue-extractor-sibling on phototourism stereo per Source #71 documentation (cvg/LightGlue paper itself). (iii) LIGHTGLUE-ONNX TENSORRT EXPORT PATHWAY PRESENT — Source #73 30 Jun 2023 changelog explicitly supports DISK end-to-end ONNX export with parallel CLI commands to SP+LightGlue; DISK+LightGlue is the second-cleanest Jetson-deployment-ready LightGlue-extractor-sibling after SP+LightGlue, before ALIKED+LightGlue (export-absent in LightGlue-ONNX). 
HOWEVER, three NEGATIVE structural findings vs ALIKED+LightGlue (D-C3-1 SECONDARY-MITIGATION): (iv) HIGHER raw GFLOPs at competitive accuracy — DISK 98.97 GFLOPs vs ALIKED-N(16) 4.05 GFLOPs = 24.4× higher GFLOPs (LightGlue-ONNX TensorRT pathway partially mitigates ~3-5× speedup over PyTorch fp16, but does not eliminate the raw-GFLOPs gap); on Jetson Orin Nano Super extrapolation DISK+LightGlue with TensorRT ≈ 50-100 ms per pair vs ALIKED+LightGlue PyTorch-fp16-only ≈ 70-140 ms per pair (DISK with TensorRT acceleration is faster than ALIKED without TensorRT, but DISK without TensorRT acceleration is slower than ALIKED — confirms that the LightGlue-ONNX TensorRT pathway is the critical D-C3-2 deployment-runtime decision lever for DISK+LightGlue's competitive Jetson latency story). (v) Lower MHA@3 on HPatches homography accuracy — DISK 70.56 vs ALIKED-N(16) 77.22 = -6.66 absolute (DISK's evenly-distributed dense keypoints give weaker geometric verification accuracy at the per-pixel level); (vi) Worse Aachen Day-Night relocalization at strictest tier — DISK at (0.25m,2°)=70.4 vs ALIKED-N(32)=77.6 = -7.2 absolute (DISK is stronger on phototourism stereo while ALIKED is stronger on visual-localization; the project's intended pipeline is closer to visual-localization than to phototourism stereo). 
One ADDITIONAL CONSIDERATION: (vii) HIGH RL-policy-gradient training cost — canonical training takes ~2 weeks on 32 GB V100 OR ~2 weeks at smaller batch on 12 GB low-memory variant; vs ALIKED's ~24 hours on RTX 3090 = DISK is materially less retrain-friendly than ALIKED at the GPU-memory + wall-clock level for D-C2-1 = (a) project-domain retrain decision; for the project's D-C2-1 retrain-vs-canonical-checkpoint trade-off, ALIKED's sparse NRE loss training paradigm is the cheaper retrain pathway, while DISK's RL-policy-gradient training is the more expensive but better-documented training pathway (paper §4 + canonical README download_dataset script + colmap/colmap2dataset.py workflow). NEW Plan-phase decision raised by DISK+LightGlue closure (will be tagged D-C3-5): D-C3-5 (NEW) DISK-pretrained-weights-choice (save-depth.pth canonical default / save-epipolar.pth alternate / project-domain retrain on aerial nadir corpus) — Plan-phase decision; canonical paper §6 documents save-depth.pth as best-performing default variant + save-epipolar.pth as supplementary-material alternate; for the project's pinned UAV-vs-satellite-tile registration use case, save-depth.pth is the recommended canonical default (strongest documentary IMW2020 stereo + multiview AUC numbers + documented Aachen Day-Night transitive lift via ALIKED paper Table VII cross-cite), with save-epipolar.pth as a fallback if depth-map ground-truth is unavailable for aerial-domain retrain (paper §4 epipolar reward variant trades 0.5-1 absolute AUC for not requiring depth maps). 
REUSE of D-C2-1 (aerial-domain training): applies identically to DISK+LightGlue as to all C-row components; canonical training on MegaDepth phototourism + depth-map supervision is NOT aerial nadir; D-C2-1 retrain decision interacts with D-C3-1 extractor choice — DISK+LightGlue retrain is well-documented but materially expensive (~2 weeks on 32 GB V100 / ~2 weeks at smaller batch on 12 GB; canonical colmap/colmap2dataset.py workflow allows direct import from COLMAP-processed AerialVL or Derkachi-flight scenes; paper §6.4 low-GPU-memory training option python train.py --substep 2 --batch-size 1 --chunk-size 10000 --warmup 500 documented to fit within 11/12 GB GPUs). C3 mandatory pre-screen status: DISK+LightGlue closes the C3 mandatory pre-screen at 3 of N candidates (SP+LightGlue at 1/N from prior session + ALIKED+LightGlue at 2/N + DISK+LightGlue at 3/N this session). The deferred Jetson Orin Nano Super hardware MVE phase still gates final accuracy/latency/memory measurement (D-C1-2 + D-C3-3) — DISK+LightGlue's measurement role on the Jetson is to establish the modern competitive RL-trained sparse-matcher reference baseline on the FULLY-CLEAN-APACHE-2.0 license track with TensorRT-equipped deployment, against which D-C3-1 SECONDARY-MITIGATION ALIKED+LightGlue (BSD-3-Clause + Apache-2.0, PyTorch-fp16-only) and other C3 candidates (XFeat, SuperGlue+SuperPoint, etc.) are scored on the project's specific operating context (aerial nadir, 1 km AGL, eastern/southern Ukraine cross-season, AC-4.1 + AC-4.2 + AC-8.3 budgets). 
License: Apache-2.0 for canonical cvlab-epfl/disk (per Source #76 GitHub API metadata) + Apache-2.0 for cvg/LightGlue matcher (per Source #70 LICENSE) + Apache-2.0 for kornia integration layer = clean Apache-2.0 license track throughout; no Magic Leap noncommercial-research disqualifier applies (vs canonical SP+LightGlue's Magic Leap restrictive license disqualifier); no BSD-3-Clause / Apache-2.0 mixed-track caveat applies (vs ALIKED+LightGlue's mixed BSD-3-Clause + Apache-2.0 track).
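
The ratios and footprint figures quoted in this card are plain arithmetic on the cited table values, and can be re-derived with the short sketch below. The constants are copied from this file (ALIKED paper Table III values as cited here). The helper names and the 2-bytes-per-parameter fp16 assumption are illustrative only and do not come from any cited codebase.

```python
# Re-derive the ratios quoted in this card from the cited table values.
# All constants are copied from this file; helper names are hypothetical.

DISK_GFLOPS = 98.97         # ALIKED paper Table III, as cited in this file
ALIKED_N16_GFLOPS = 4.05
SUPERPOINT_GFLOPS = 26.11
DISK_FPS_RTX2060 = 11.81    # 640x480 input, 1k keypoints
DISK_PARAMS_M = 1.092       # millions of parameters
LIGHTGLUE_PARAMS_M = 12.0   # millions of parameters, as stated in this card


def ms_per_inference(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


def fp16_weights_mb(params_millions: float) -> float:
    """Weight footprint in MB, assuming 2 bytes per parameter (fp16) and
    1 MB = 1e6 bytes. Activations and runtime buffers are not included."""
    return params_millions * 2.0


if __name__ == "__main__":
    print(f"DISK / ALIKED-N(16) GFLOPs: {DISK_GFLOPS / ALIKED_N16_GFLOPS:.1f}x")
    print(f"DISK / SuperPoint GFLOPs:   {DISK_GFLOPS / SUPERPOINT_GFLOPS:.1f}x")
    print(f"DISK extraction time:       {ms_per_inference(DISK_FPS_RTX2060):.1f} ms")
    total = DISK_PARAMS_M + LIGHTGLUE_PARAMS_M
    print(f"fp16 weights:               {fp16_weights_mb(total):.1f} MB")
```

The footprint covers weights only, so it is a lower bound on the memory the pair of models needs at run time.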

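The per-pair latency extrapolations in the card above (about 50-100 ms on the TensorRT path, about 200-400 ms on the PyTorch fp16 fallback) can be turned into per-frame totals for a given K. The sketch below is a rough budget check, not a measurement. It assumes pairs are processed one after another with no batching or overlap, and it treats the 400 ms AC-4.1 figure used in this file as if the whole budget were available to C3. The function and dictionary names are hypothetical.

```python
# Per-frame latency at K image pairs, checked against a latency budget.
# Per-pair ranges are this file's own extrapolations, not measurements.
# Assumption: pairs run sequentially, with no batching or pipelining.
import math

BUDGET_MS = 400.0  # AC-4.1 end-to-end p95 budget cited in this file

PER_PAIR_MS = {
    "tensorrt_fp16": (50.0, 100.0),
    "pytorch_fp16": (200.0, 400.0),
}


def frame_latency_ms(per_pair_ms: float, k_pairs: int) -> float:
    """Total matching latency for one UAV frame matched against K tiles."""
    return per_pair_ms * k_pairs


def max_pairs_within(budget_ms: float, per_pair_ms: float) -> int:
    """Largest K whose sequential latency stays within the budget."""
    return math.floor(budget_ms / per_pair_ms)


if __name__ == "__main__":
    for path, (fast, slow) in PER_PAIR_MS.items():
        print(
            path,
            f"K=10: {frame_latency_ms(fast, 10):.0f}-{frame_latency_ms(slow, 10):.0f} ms,",
            f"max K in budget: {max_pairs_within(BUDGET_MS, slow)}-{max_pairs_within(BUDGET_MS, fast)}",
        )
```

With the TensorRT range the product at K=10 is 500 to 1000 ms, and the largest K that stays within 400 ms is 4 to 8. Because AC-4.1 is an end-to-end budget, the share actually available to C3 would be smaller than the full 400 ms assumed here.
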
C3 — Per-Mode API Capability Verification (engine Step 2 — DISK+LightGlue session entry, 2026-05-08)

  • Source: Source #76 (cvlab-epfl/disk canonical README + GitHub API license metadata — python detect.py --height 1024 --width 1024 --n 2048 h5_artifacts_destination images_directory for canonical pretrained inference, two pretrained checkpoints save-depth.pth + save-epipolar.pth distributed via canonical repo + auto-download via kornia model registry, Apache-2.0 confirmed via GitHub API license.spdx_id: "Apache-2.0"; cvg/LightGlue lightglue/disk.py Apache-2.0 inheritance via kornia integration with kornia.feature.DISK.from_pretrained("depth")), accessed 2026-05-08; Source #77 (canonical paper arXiv:2006.13566 / Tyszkiewicz et al. NeurIPS 2020 — §3 architecture [4-layer U-Net + deformable bottleneck + per-pixel dense descriptor head + per-pixel scoring head] + §4 method [REINFORCE-class policy gradient + depth-based reward + inverse_T = θ_M matching temperature scheduling annealed 15→50 over 20 epochs] + §5 experiments [HPatches MMA@3 Figure 5 + IMW2020 stereo + multiview AUC Table 1 best single-extractor result at 2020 publication time] + §6 limitations + §6.4 low-GPU-memory training option python train.py --substep 2 --batch-size 1 --chunk-size 10000 --warmup 500); Source #71 (cvg/LightGlue canonical paper Appendix A Table 6 cross-cite — DISK+LightGlue stereo AUC@5° on IMC 2020 = 67.02 vs SP+LightGlue 59.03 = +7.99 absolute + DISK+LightGlue stereo AUC@10° on IMC 2020 = 83.45 vs SP+LightGlue 77.96 = +5.49 absolute); Source #70 (cvg/LightGlue canonical README — from lightglue import LightGlue, DISK; matcher = LightGlue(features='disk').eval().cuda(); transitive citation for lightglue/disk.py Apache-2.0 + kornia integration); Source #73 (fabio-sim/LightGlue-ONNX companion — DISK end-to-end ONNX export pathway PRESENT: changelog 30 Jun 2023 "DISK feature extraction support added"; CLI commands parallel SP+LightGlue export); Source #75 ALIKED paper Table III + V + VII cross-cite (DISK 1.092M params / 98.97 GFLOPs / 11.81 FPS RTX 2060 / MMA@3=77.59% / MHA@3=70.56% / 
IMW2020 Stereo NM=2048 / mAA(10°)=85.20 / Aachen Day-Night at 2048 keypoints / mNN = 70.4/82.7/94.9)
  • Inputs in the example: Two arbitrary RGB or grayscale images at any (independent) resolutions; canonical demo uses arbitrary image directories; load_image returns torch.Tensor[3, H, W] normalized to [0, 1]; DISK extractor requires RGB input — auto-converts grayscale via kornia.color.grayscale_to_rgb per lightglue/disk.py lines 31-32; image dimensions must be a multiple of 16 (auto-padded preserving aspect ratio via pad_if_not_divisible=True in cvg/LightGlue port); DISK extractor cropped output: feats: {keypoints: torch.Tensor[B, N, 2], descriptors: torch.Tensor[B, N, 128], keypoint_scores: torch.Tensor[B, N]} where N ≤ max_num_keypoints (canonical default None for threshold-based detection; project pinned to 1024); LightGlue matcher input: dict with image0 and image1 keys mapping to per-image DISK output dicts; output: {matches0: torch.Tensor[B, N], matches1: torch.Tensor[B, N], matching_scores0: torch.Tensor[B, N], matching_scores1: torch.Tensor[B, N], matches: List[torch.Tensor[K, 2]], scores: List[torch.Tensor[K]], stop: int} where K is the number of correspondences after τ=0.1 filtering; rbd(x) removes the batch dimension
  • Outputs in the example: Up to 1024 2D-2D correspondences with per-correspondence confidence score s_k ∈ [τ=0.1, 1.0]; canonical paper Table 1 reports IMW2020 stereo AUC=0.51315 / multiview AUC=0.72705 with 2k features (canonical paper schedule, best single-extractor result at 2020 publication time); ALIKED paper Table III reports HPatches MMA@3=77.59% / MHA@3=70.56% with 1k keypoints (LightGlue would lift these by 5-10 absolute per Source #71 LightGlue paper documentary evidence); ALIKED paper Table V reports IMW-test Stereo mAA(5°)=44.80 / mAA(10°)=85.20 / NM=2048 with 2048 keypoints + ratio-test matcher; ALIKED paper Table VII reports Aachen Day-Night at 2048 keypoints / mNN matcher = 70.4/82.7/94.9 at (0.25m,2°)/(0.5m,5°)/(5m,10°) — beats SuperPoint at strictest tier by +1.0 absolute, loses to ALIKED-N(32) by -7.2 absolute; CRITICAL CROSS-PAPER RESULT: cvg/LightGlue paper Source #71 Appendix A Table 6 documents DISK+LightGlue IMC 2020 stereo AUC@5°=67.02 vs SP+LightGlue 59.03 = +7.99 absolute + AUC@10°=83.45 vs SP+LightGlue 77.96 = +5.49 absolute (strongest documentary technical-superiority signal for D-C3-1 RECOMMENDED-PRIMARY-MITIGATION lock); canonical RTX-2060 throughput (ALIKED paper Table III): DISK 11.81 FPS @ 640×480 + 1k keypoints = 84.7 ms per pair extraction-only (slowest among modern competitive sparse extractors; LightGlue-ONNX TensorRT acceleration partially mitigates via 3-5× speedup at fp16)
  • Project inputs: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) → bilinearly downscaled-to-largest-edge 1024 → grayscale-converted (or RGB-preserved per project's nav-camera config) → fp16 batch on Jetson Orin Nano Super (auto-padded to multiple of 16 via pad_if_not_divisible=True); per-UAV-frame K=10 top-K retrieved satellite tiles from C2 → bilinearly downscaled-to-largest-edge 1024 → grayscale-or-RGB → fp16 batch on Jetson Orin Nano Super; total per-frame compute = K=10 image pairs (UAV-frame, satellite-tile)
  • Project outputs required: Up to 1024 2D-2D correspondences per (UAV-frame, satellite-tile) image pair with confidence scores; cosine-confidence-threshold filter at 0.95 × per-pair-max-score to retain only the most confident correspondences; feeds C4 PnP+RANSAC pose estimator with 4-point minimum; satisfies AC-1.1 frame-center-within-50m pose accuracy requirement when pairing with high-recall C2 retrieval (paper Table 6 IMC 2020 documentary evidence DISK+LightGlue stereo AUC@5°=67.02 = nominally satisfies AC-1.1 50m bar at the stereo-AUC level; LightGlue lifts further); satisfies AC-1.2 frame-center-within-20m at tighter tolerance (paper Table 6 DISK+LightGlue stereo AUC@10°=83.45 = comfortably satisfies AC-1.2 20m bar); satisfies AC-2.1b satellite-anchor-registration-succeeds gate when C3 image pair achieves >30 inliers after RANSAC; MEDIUM-RISK latency-budget interaction: DISK 98.97 GFLOPs at 640×480 + 1k keypoints (24.4× higher than ALIKED-N(16); 3.8× higher than SuperPoint) → at K=10 pairs × extraction (~50-100 ms TensorRT-equipped Jetson Orin Nano Super extrapolation / ~200-400 ms PyTorch-fp16-only fallback) + matching (~30-50 ms with adaptive depth) = ~500-1500 ms per UAV frame TensorRT-equipped / 1500-4500 ms PyTorch-fp16-only; TIGHT TO HARSH against AC-4.1 400 ms budget on PyTorch-fp16 path; MEDIUM-RISK on TensorRT path; the LightGlue-ONNX TensorRT pathway (Source #73) is the critical D-C3-2 deployment-runtime decision lever for DISK+LightGlue's competitive Jetson latency story; satisfies AC-4.2 memory budget with comfortable margin (~26 MB total weights at fp16, comparable to SP+LightGlue + ALIKED+LightGlue)
  • Match assessment: exact mode match for (DISK extractor with weights='depth' canonical default at 1024-largest-edge RGB input, 1024 max keypoints, 128-D descriptors, LightGlue matcher with features='disk', n_layers=9, depth_confidence=0.95, width_confidence=0.99, filter_threshold=0.1, flash=True, up to 1024 2D-2D correspondences output with confidence scores); training+evaluation+canonical-pretrained-distribution CLIs exist in cvlab-epfl/disk (Source #76) AND in cvg/LightGlue DISK port via kornia integration (Source #70); two DISK pretrained-weights sibling modes documented (DISK-depth canonical default / DISK-epipolar supplementary-material alternate); companion cvg/Hierarchical-Localization (hloc) ships canonical NetVLAD top-50 → SuperPoint+LightGlue → PnP+RANSAC pipeline (transitive applicability to DISK+LightGlue via features='disk' swap); paper Appendix A Table 6 documentary technical superiority over canonical SP+LightGlue on IMC 2020 stereo (+7.99 absolute AUC@5° + +5.49 absolute AUC@10° = strongest documentary signal in the project's evaluated C3 candidate space); ⚠️ partial input domain (canonical training on EPFL CVLab DISK dataset ~164 GB sampled from MegaDepth phototourism with depth-map supervision — NOT aerial nadir; same caveat as SP+LightGlue + ALIKED+LightGlue; D-C2-1 retrain decision applies; canonical colmap/colmap2dataset.py workflow allows direct import from COLMAP-processed AerialVL or Derkachi-flight scenes for project-side aerial retrain, but cost is ~2 weeks on 32 GB V100 / ~2 weeks at smaller batch on 12 GB low-memory variant); MEDIUM-RISK D-C3-2 Jetson export-pathway gate: Source #73 (fabio-sim/LightGlue-ONNX) DOES ship documented DISK end-to-end ONNX/TensorRT pipeline as of January 2026 (30 Jun 2023 changelog entry "DISK feature extraction support added"; CLI commands parallel SP+LightGlue with lightglue-onnx export disk_lightglue --num-keypoints 1024 -b 2 -h 1024 -w 1024 --fp16 --device cuda); LightGlue-ONNX TensorRT pathway provides 
3-5× speedup over PyTorch fp16 → DISK+LightGlue Jetson runtime at ~50-100 ms per pair fp16 + TensorRT EP standard / ~200-400 ms PyTorch-fp16-only fallback; TensorRT-equipped Jetson Orin Nano Super extrapolation puts AC-4.1 400 ms budget at MEDIUM-RISK margin (better than ALIKED+LightGlue's PyTorch-fp16-only HARSH-RISK margin since LightGlue-ONNX TensorRT path is available; worse than SP+LightGlue's TIGHT margin due to DISK's higher raw GFLOPs of 98.97 vs SP's 26.11); ⚠️ for Jetson Orin Nano Super latency / memory / accuracy on TensorRT path (no documentary measurement — Jetson MVE will resolve via D-C3-3); for fully clean Apache-2.0 license-track placement THROUGHOUT = CLEANEST license-compliant LightGlue-extractor-sibling, NO Magic Leap noncommercial-research disqualifier, NO BSD-3-Clause + Apache-2.0 mixed-track caveat
  • If ⚠️ or : docs do not explicitly disqualify the algorithmic mode at the API or capability level. The (extractor, matcher, keypoint count, descriptor dimension, input size, normalisation, output shape) tuple is documented and runnable directly via cvlab-epfl/disk canonical CLI OR via cvg/LightGlue DISK port via kornia integration. HOWEVER, DISK 98.97 GFLOPs HIGHEST raw-compute-cost among modern competitive sparse extractors creates a MEDIUM-RISK D-C3-2 Jetson deployment gate vs ALIKED+LightGlue's smaller-GFLOPs profile (LightGlue-ONNX TensorRT pathway partially mitigates via 3-5× speedup over PyTorch fp16 but does not eliminate the raw-GFLOPs gap): project's Jetson runtime path for DISK+LightGlue PREFERS (a) ONNX Runtime + TensorRT EP via Source #73 (DOMINANT path for AC-4.1 latency budget satisfaction; ~50-100 ms per pair Jetson Orin Nano Super extrapolation), with (b) pure TensorRT via lightglue-onnx trtexec Polygraphy-based pathway as alternate (similar latency profile), and (c) PyTorch-fp16 baseline as fallback (significantly slower at ~200-400 ms per pair, fails AC-4.1 budget at K=10 pairs/frame). → Status: Documentary lead with FULLY-CLEAN-APACHE-2.0 license track THROUGHOUT + PAPER-TABLE-6-+7.99-ABSOLUTE-AUC@5°-DOCUMENTARY-TECHNICAL-SUPERIORITY-OVER-CANONICAL-SP+LIGHTGLUE + LIGHTGLUE-ONNX-TENSORRT-EXPORT-PATHWAY-PRESENT (Source #73 30 Jun 2023 changelog) + 98.97-GFLOPS-HIGHEST-RAW-COMPUTE-COST CAVEAT (MEDIUM-RISK D-C3-2 mitigation via TensorRT acceleration) + RL-POLICY-GRADIENT-TRAINING-RETRAIN-COST CAVEAT (~2 weeks on 32 GB V100 vs ALIKED's ~24 hours on RTX 3090) + aerial-domain-training caveat (D-C2-1 reuse) + D-C3-5 NEW DISK-pretrained-weights-choice Plan-phase decision, Apache-2.0 track throughout. Final lead promotion to "Selected" or "Conditional RECOMMENDED-PRIMARY-mitigation" deferred to D-C3-1 + D-C3-2 + D-C3-3 + D-C3-5 + D-C1-2 + D-C2-4 dedicated Jetson Orin Nano Super hardware MVE phase. 
Per the engine Component Option Breadth rule, DISK+LightGlue closes the C3 mandatory pre-screen at 3 of N candidates (SP+LightGlue + ALIKED+LightGlue + DISK+LightGlue) with the canonical RL-trained sparse-extractor + matcher reference baseline on the FULLY-CLEAN-APACHE-2.0 license track; subsequent C3 candidates (XFeat, SuperGlue+SuperPoint mandatory simple-baseline, DoGHardNet+LightGlue, etc.) will be separately-cataloged in subsequent sessions.
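
The preprocessing described in this entry (downscale so the largest edge is 1024, then pad to a multiple of 16) fixes the tensor shapes that reach the extractor. The sketch below only does the shape bookkeeping for the 5472×3648 nav frame. Rounding to the nearest pixel is an assumption of this sketch, and the helper names are hypothetical. The actual padding is handled by the pad_if_not_divisible=True option mentioned above.

```python
# Shape bookkeeping for the preprocessing described in this entry:
# resize so the largest edge is 1024, then pad up to a multiple of 16.
# Assumption: resized sizes are rounded to the nearest pixel.
import math
from typing import Tuple


def resize_to_largest_edge(width: int, height: int, edge: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) so that the larger side equals `edge`."""
    scale = edge / max(width, height)
    return round(width * scale), round(height * scale)


def pad_to_multiple(width: int, height: int, multiple: int = 16) -> Tuple[int, int]:
    """Round each side up to the next multiple of `multiple`."""
    return (
        math.ceil(width / multiple) * multiple,
        math.ceil(height / multiple) * multiple,
    )


if __name__ == "__main__":
    resized = resize_to_largest_edge(5472, 3648)  # ADTi 20MP nav frame
    padded = pad_to_multiple(*resized)
    print("resized:", resized, "padded:", padded)
```

Under these assumptions the nav frame is resized to 1024×683 and padded to 1024×688 before extraction.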

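The pinned mode in this entry (DISK extractor, LightGlue with features='disk', then a relative confidence filter at 0.95 × the per-pair maximum score) can be sketched as below. The calls follow the cvg/LightGlue README usage cited as Source #70. The match_pair and keep_confident helpers are hypothetical names for illustration, not project code. torch and lightglue are imported lazily, so the pure-Python filter can be used without them.

```python
# Sketch of the DISK + LightGlue pairing described in this entry.
# Requires torch and the cvg/LightGlue package for match_pair();
# keep_confident() is pure Python. Helper names are hypothetical.
from typing import List, Sequence

MAX_KEYPOINTS = 1024    # project pin stated in this entry
SCORE_FRACTION = 0.95   # keep matches scoring >= 0.95 x the per-pair maximum


def keep_confident(scores: Sequence[float], fraction: float = SCORE_FRACTION) -> List[int]:
    """Indices of matches whose score is at least fraction x the best score."""
    if not scores:
        return []
    cutoff = fraction * max(scores)
    return [i for i, s in enumerate(scores) if s >= cutoff]


def match_pair(path0: str, path1: str, device: str = "cuda"):
    """Match two images; return filtered keypoints in each image and scores."""
    import torch
    from lightglue import DISK, LightGlue
    from lightglue.utils import load_image, rbd

    # The 'depth' weights are the default checkpoint per this entry.
    extractor = DISK(max_num_keypoints=MAX_KEYPOINTS).eval().to(device)
    matcher = LightGlue(
        features="disk",
        depth_confidence=0.95,
        width_confidence=0.99,
        filter_threshold=0.1,
    ).eval().to(device)

    image0 = load_image(path0).to(device)
    image1 = load_image(path1).to(device)

    with torch.inference_mode():
        feats0 = extractor.extract(image0)
        feats1 = extractor.extract(image1)
        out = matcher({"image0": feats0, "image1": feats1})

    # Remove the batch dimension from every output dict.
    feats0, feats1, out = [rbd(x) for x in (feats0, feats1, out)]
    matches = out["matches"]  # shape (K, 2): keypoint indices per image
    scores = out["scores"]    # shape (K,)

    keep = torch.tensor(
        keep_confident(scores.tolist()), dtype=torch.long, device=matches.device
    )
    pts0 = feats0["keypoints"][matches[keep, 0]]
    pts1 = feats1["keypoints"][matches[keep, 1]]
    return pts0, pts1, scores[keep]
```

A caller would still need to check that enough correspondences survive the filter before handing them to the C4 PnP+RANSAC stage, since a relative threshold this strict can leave very few matches on hard pairs.
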
C3 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate (DISK+LightGlue addition)

DISK+LightGlue — per-numbered binding (C3-relevant lines only; cross-cutting N/A above also apply identically)

Cells share the legend defined under the MixVPR sub-matrix (C2). Where a binding is identical in both substance and evidence to the SP+LightGlue or ALIKED+LightGlue rows, the DISK+LightGlue row points to those rows to avoid restating; where DISK+LightGlue's pinned mode produces a materially different binding (Apache-2.0-throughout license track vs SP+LightGlue's Magic-Leap-restrictive disqualifier and ALIKED+LightGlue's BSD-3-Clause + Apache-2.0 mixed track, paper Table 6 +7.99-absolute-AUC@5° documentary technical superiority over canonical SP+LightGlue, LightGlue-ONNX TensorRT-export-pathway-PRESENT vs ALIKED+LightGlue's TensorRT-export-pathway-ABSENT, 98.97-GFLOPS-HIGHEST-raw-compute-cost), the DISK+LightGlue row carries a distinct evidence cite.

Line Binding Evidence (one-line cite)
AC-1.1 (frame-center within 50 m, ≥80% normal-flight photos) Pass (documentary on IMC 2020 stereo Table 6) → Verify (aerial nadir cross-domain) Source #71 paper Appendix A Table 6 documents DISK+LightGlue stereo AUC@5°=67.02 vs SP+LightGlue 59.03 = +7.99 absolute on IMC 2020 stereo — strongest documentary signal for AC-1.1 frame-center-within-50m at the stereo-AUC level. Source #75 ALIKED paper Table VII documents DISK Aachen Day-Night at 2048 keypoints / mNN at (0.25m,2°)=70.4 (vs SuperPoint=69.4 = +1.0 absolute); transitive lineage with Source #71 paper §5.4 LightGlue lift over mNN (~10-15 absolute) suggests expected DISK+LightGlue Aachen Day accuracy ≈ 80-85% at strictest tier — competitive with SP+LightGlue's 89.2 but lower than ALIKED+LightGlue's expected approach to 89.2+. Aerial nadir cross-domain validation required at Jetson MVE on AerialExtreMatch + Derkachi flight. D-C2-1 reuse: canonical training on MegaDepth phototourism + depth-map supervision is NOT aerial nadir; aerial-domain retrain on AerialVL is well-documented (canonical colmap/colmap2dataset.py workflow) but materially expensive (~2 weeks on 32 GB V100)
AC-1.2 (frame-center within 20 m, ≥50% normal-flight photos) Pass (documentary on IMC 2020 stereo Table 6) → Verify (aerial nadir cross-domain tighter tail) Same as AC-1.1, tighter tail; paper Table 6 DISK+LightGlue stereo AUC@10°=83.45 vs SP+LightGlue 77.96 = +5.49 absolute — comfortably satisfies AC-1.2 20m bar at the stereo-AUC level. DISK+LightGlue-specific advantage over ALIKED+LightGlue: Source #75 ALIKED paper Table V documents DISK Stereo NM=2048 / Multiview NL=2424.8 vs ALIKED-N(16) Stereo NM=1934.2 / Multiview NL=1975.4 = +5.9% / +22.7% more matches than ALIKED — critical for AC-1.2 tight-tail registration where high inlier counts reduce C4 PnP+RANSAC failure rate. Trade-off vs ALIKED+LightGlue: DISK has -6.66 absolute MHA@3 on HPatches homography accuracy (paper Table III: DISK 70.56 vs ALIKED-N(16) 77.22) — DISK's evenly-distributed dense keypoints give weaker per-pixel geometric verification accuracy, but the higher match count partially compensates at the tight tail of the precision distribution
AC-2.1b (satellite-anchor registration succeeds, AC-1.1/1.2 + AC-2.2 + AC-8.2 + AC-8.6 conditions) Pass (documentary on IMC 2020) → Verify (aerial nadir cross-domain) C3's contribution is the geometric verification step; paper Table 6 DISK+LightGlue stereo AUC@5°=67.02 (vs SP+LightGlue 59.03 = +7.99 absolute lift); AC-2.1b registration-success rate is DISK+LightGlue's STRONGEST documentary signal vs canonical SP+LightGlue on phototourism stereo. Aerial nadir cross-domain validation required; Jetson MVE measurement on AerialExtreMatch + Derkachi flight
AC-3.3 (≥3 disconnected segments via satellite-reference re-localization) Pass (per-pair stateless) → Verify (recall under perceptual-aliasing + scene-change) DISK+LightGlue's per-pair geometric verification is stateless — applies identically to first-flight + re-localization scenarios. DISK+LightGlue-specific consideration: DISK's dense per-pixel descriptor head (vs SuperPoint's sparse per-keypoint head and ALIKED's SDDH per-keypoint deformable head) provides structural advantage on perceptual-aliasing recovery — RL-policy-gradient training optimizes directly for "many correct feature matches" objective, which is the exact AC-3.3 re-localization recall criterion. Cross-season recall under DISK's MegaDepth-trained weights is unverified on aerial nadir; AerialExtreMatch + D-C2-1 required
AC-4.1 (latency <400 ms p95, end-to-end camera→FC) Verify — MEDIUM-RISK margin at K=10 pairs/frame on TensorRT-equipped path; HARSH-RISK on PyTorch-fp16-only fallback CRITICAL latency-budget interaction: ALIKED paper Table III canonical RTX-2060 throughput for DISK = 11.81 FPS @ 640×480 + 1k keypoints = 84.7 ms per pair extraction-only (vs SuperPoint 52.63 FPS = 19.0 ms; vs ALIKED-N(16) 77.40 FPS = 12.92 ms — DISK is ~6.6× slower than ALIKED-N(16) and ~4.5× slower than SuperPoint at the canonical extraction step). LightGlue matching with adaptive depth adds a further ~5-10 ms on RTX 2060 (per Source #71 paper §5.4 1.86× speedup on easy pairs); total ~90-95 ms per pair on RTX 2060 PyTorch-fp16. CRITICAL Jetson Orin Nano Super extrapolation: Jetson Orin Nano Super has ~1/4 to 1/6 of RTX 2060 throughput → ~500-1500 ms per pair @ 1024 keypoints on PyTorch-fp16-only Jetson (HARSH-RISK; FAILS AC-4.1 at K=1 alone). HOWEVER, LightGlue-ONNX TensorRT pathway PRESENT via Source #73 (30 Jun 2023 changelog "DISK feature extraction support added"; 19 Jul 2023 TensorRT support; 04 Oct 2023 MultiHead-Attention fusion + Fused LightGlue ONNX with FlashAttention-2 up to 80% faster inference; 02 Nov 2023 TopK trick optimizes out ArgMax ~30% speedup) → TensorRT-equipped Jetson Orin Nano Super extrapolation ~50-100 ms per pair @ 1024 keypoints fp16. At K=10 top-K retrieval pairs per UAV frame, this gives 500-1000 ms per UAV frame TensorRT-equipped (TIGHT against AC-4.1 budget) / 1500-4500 ms PyTorch-fp16-only (HARSH-FAIL) → AC-4.1 400 ms budget MEDIUM-RISK on TensorRT path; HARSH-FAIL on PyTorch-fp16-only path. D-C3-2 NEW Plan-phase choice for DISK+LightGlue: option (c) ONNX Runtime + TensorRT EP via Source #73 is the DOMINANT runtime path for DISK+LightGlue's competitive Jetson latency (vs PyTorch-fp16-only fallback which fails AC-4.1 at K≥3 pairs); this is the opposite priority ordering vs ALIKED+LightGlue which is forced to PyTorch-fp16-only due to ALIKED-export-absence. 
D-C3-3 NEW Plan-phase choice (K-pairs-per-frame budget): similarly tight for DISK+LightGlue as for SP+LightGlue; likely requires K reduction from 10 to 3-5 on Jetson PyTorch-fp16-only fallback OR mandatory TensorRT acceleration for K=10 satisfaction. D-C3-5 NEW Plan-phase choice (DISK-pretrained-weights-choice): save-depth.pth canonical default has full documentary IMW2020 + IMC 2020 numbers; save-epipolar.pth is the supplementary-material alternate variant with -0.5 to -1 absolute AUC trade-off vs save-depth.pth per paper §6.2; for the project's pinned UAV-vs-satellite-tile registration use case save-depth.pth is the recommended canonical default
AC-4.2 (memory <8 GB shared) Pass (with Verify) — comparable model footprint to SP+LightGlue + ALIKED+LightGlue DISK 1.092M params + LightGlue 12M params at canonical 9-layer config = ~13.1M params ≈ ~26 MB total weights at fp16 (comparable to SP+LightGlue's ~27 MB + ALIKED+LightGlue's ~26 MB). Activations at 1024×1024 RGB batch=1 ~80-200 MB at fp16 (DISK U-Net dense feature map at multi-scale + per-pixel scoring head + LightGlue self-attention + cross-attention layers ~30-80 MB per layer at 1024 keypoints — DISK's dense feature map slightly larger than ALIKED's SDDH per-keypoint sampling but comparable to SuperPoint's dense interest-point detector). No descriptor-cache pressure at C3 (vs C2 single-stage which has descriptor cache); C3 cache footprint is exactly 0 GB of the 10 GB AC-8.3 cache budget (same as SP+LightGlue + ALIKED+LightGlue). Co-resident memory pressure with C1/C2/C4/C5/C6 manageable — Jetson MVE measurement
AC-8.1 (cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px) Pass (with Verify) — resolution-agnostic at API level DISK is resolution-agnostic at the algorithm level (4-layer U-Net + deformable bottleneck accepts any input size that is multiple of 16; auto-padded preserving aspect ratio via pad_if_not_divisible=True in cvg/LightGlue port; canonical demo evaluates at 1024×1024 + 2k features per python detect.py --height 1024 --width 1024 --n 2048); cross-resolution matching at 0.5 m/px tile GSD vs nav-camera 12 cm/px GSD at 1 km AGL unverified — AerialExtreMatch cross-scale cells are the documentary target
AC-8.6 — Scale-ratio (any UAV-frame ground footprint at deployment altitude must be retrievable) Verify — single-scale matching method At 1 km AGL the nav-camera frame footprint is 470×314 m to 980×655 m; DISK's canonical 1024-largest-edge RGB input is the same as SP+LightGlue + ALIKED+LightGlue. DISK-specific consideration: DISK is a single-scale matching method (paper §3 — does NOT have a documented multi-scale variant like ALIKED-N(16, MS)); for aerial nadir UAV frames vs satellite tiles where scale variation is bounded by AGL altitude × satellite tile GSD ratio (~4× at 1 km AGL × 0.5 m/px), DISK is at-or-near the documented single-scale matching limit per Source #75 ALIKED paper §VI-C2 + Fig. 6 bottom (DISK 4× scale-difference matching accuracy ≈ 30%); multi-scale ALIKED-N(16, MS) handles up to 8× scale difference at the cost of higher inference latency. For the project's bounded scale variation, single-scale DISK should be sufficient but Plan-phase may want to evaluate multi-scale extension at Jetson MVE
AC-8.6 — Scene change in active-conflict sectors Verify — partial geometric robustness via dense per-pixel descriptor head Cratering / building destruction / road realignment is exactly the AerialExtreMatch "scene-change" cell. DISK+LightGlue-specific structural advantage over SP+LightGlue: DISK's per-pixel dense descriptor head (paper §3) provides per-pixel-level geometric matching — the RL-policy-gradient training optimizes directly for "many correct feature matches" surrogate objective, which selects keypoints + descriptors that are structurally robust to local appearance changes (the paper §5 IMW2020 results demonstrate competitive cross-domain matching on phototourism with lighting + viewpoint variations). DISK+LightGlue's structural defense against scene-change is comparable to SP+LightGlue's (similar dense descriptor extraction paradigm) and slightly weaker than ALIKED+LightGlue's deformable per-keypoint head (paper Eq. 4-5 SDDH adaptive geometric-invariance modeling). AerialExtreMatch + D-C2-1 retrain decision required
AC-8.6 — Compute & latency under steady-state and re-loc-trigger Verify — MEDIUM-RISK margin under steady-state on TensorRT-equipped path DISK+LightGlue's per-pair compute is variable (LightGlue adaptive-depth + adaptive-width pruning) — same advantage as SP+LightGlue + ALIKED+LightGlue (paper §5.4 1.86× speedup on easy pairs / 1.16× on hard / 1.45× average). DISK-specific consideration: LightGlue-ONNX TensorRT acceleration multiplier (~3-5× speedup over PyTorch fp16) is the critical D-C3-2 deployment-runtime decision lever for DISK+LightGlue's competitive Jetson latency story; without TensorRT, DISK's 98.97 GFLOPs at 640×480 + 1k keypoints would make DISK+LightGlue uncompetitive on Jetson Orin Nano Super at K=10 pairs/frame. Steady-state UAV operation has many high-overlap pairs (consecutive UAV frames overlap at 1 km AGL with low altitude-variability) → adaptive-depth advantage compounds with TensorRT acceleration. Re-loc-trigger workload after AC-3.3 disconnection has more cross-season + cross-time hard pairs → adaptive-depth advantage is reduced; combined with DISK's 98.97 GFLOPs raw cost → MEDIUM-RISK D-C3-3 K-pairs-per-frame budget gate for DISK+LightGlue — TensorRT path keeps DISK+LightGlue competitive but not dominant on Jetson
AC-NEW-2 (spoofing-promotion latency <3 s p95) Pass (single-pair latency comfortable on TensorRT path) → Verify (multi-pair re-anchor latency) Single-pair latency budget very comfortable — DISK+LightGlue per-pair at TensorRT-equipped fp16 (~50-100 ms standard / 25-50 ms adaptive on Jetson Orin Nano Super extrapolation) << 3 s budget (~30-60× under). Multi-pair re-anchor latency at K=10 pairs: 500-1000 ms standard / 250-500 ms adaptive — comfortably within 3 s budget on TensorRT path. DISK+LightGlue-specific consideration: paper Table 6 IMC 2020 stereo documentary evidence DISK+LightGlue AUC@5°=67.02 (vs SP+LightGlue 59.03 = +7.99 absolute) demonstrates strongest re-anchor reliability vs canonical SP+LightGlue on phototourism stereo; transitive lineage suggests strong re-anchor reliability if C2 retrieval delivers high-recall top-K with cross-season + cross-time scenes
AC-NEW-6 (imagery freshness — never satellite_anchored on stale-tile match) Pass (mechanical) DISK+LightGlue produces 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair; freshness-age decision is a downstream C5/C6 filter on the (tile-id, match-success, inlier-count) tuple. No structural interaction with freshness — same as SP+LightGlue + ALIKED+LightGlue rows
AC-NEW-7 (cache-poisoning safety budget — P(>30 m geo-misalign) <1%, P(>100 m) <0.1%) Pass — STRUCTURAL geometric-verification advantage over C2 single-stage retrieval Same as SP+LightGlue + ALIKED+LightGlue rows — C3's per-correspondence confidence threshold τ=0.1 + soft partial assignment matrix + downstream C4 PnP+RANSAC inlier selection provides the structural geometric-verification layer that catches mid-flight-written misaligned tiles (AC-8.4); rejects poisoned-but-misaligned tiles via low-inlier-count or high-residual-error at the RANSAC step. DISK+LightGlue-specific consideration: paper Table 6 IMC 2020 stereo +7.99 absolute AUC@5° lift over SP+LightGlue → stronger structural cache-poisoning defense via better geometric verification accuracy at the AUC level; partially offset by paper Table III -6.66 absolute MHA@3 on HPatches homography accuracy vs ALIKED-N(16) (DISK's evenly-distributed dense keypoints give weaker per-pixel geometric verification than ALIKED's deformable per-keypoint head, but DISK has more matches per pair which allows higher inlier counts at the same precision threshold)
Restriction "Operational area: eastern/southern Ukraine" — sparse-matcher train-domain match ⚠️ Documentary gap → Verify (D-C2-1 reuse + LOW retrain-friendliness vs ALIKED + WELL-DOCUMENTED retrain workflow) Canonical DISK+LightGlue weights are pre-trained on EPFL CVLab DISK dataset (~164 GB) sampled from MegaDepth phototourism scenes with depth-map supervision — same caveat as SP+LightGlue + ALIKED+LightGlue + C2 candidates; D-C2-1 retrain decision applies to DISK+LightGlue identically. DISK+LightGlue-specific consideration: LOW retrain-friendliness vs ALIKED at the GPU-memory level (canonical RL-policy-gradient training takes ~2 weeks on 32 GB V100s OR ~2 weeks at smaller batch on 12 GB low-memory variant per Source #76 README + Source #77 paper §6.4 — vs ALIKED's ~24 hours on RTX 3090 per Source #75 paper §V sparse NRE loss memory advantage); HOWEVER, well-documented retrain workflow — canonical colmap/colmap2dataset.py workflow allows direct import from COLMAP-processed AerialVL or Derkachi-flight scenes (much better-documented than SP-reproduction which would require Magic-Leap's Homographic Adaptation training pipeline + LICENSE clearance). For D-C2-1 = (a) project-domain retrain decision on aerial nadir corpus, DISK is materially more expensive than ALIKED to retrain but materially cheaper than SP-reproduction. AerialExtreMatch + Derkachi flight required
Restriction "Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural)" — sparse-matcher scale band match Verify Same as AC-8.6 scale-ratio row; cross-scale matching at the project's altitude band is the AerialExtreMatch cross-scale cell; DISK is single-scale matching method (no documented multi-scale variant); for the project's bounded scale variation (~4× at 1 km AGL × 0.5 m/px) DISK is at-or-near the documented single-scale matching limit
Restriction "Weather: predominantly sunny ... seasonal/visibility classes" — sparse-matcher cross-season generalization Verify (DOCUMENTARY EVIDENCE on phototourism stereo from paper Table 6) Cross-season matching is the dominant aerial-cross-domain failure mode; canonical DISK+LightGlue weights are MegaDepth-perspective-trained — D-C2-1 is the primary lever. DISK+LightGlue-specific finding: paper Appendix A Table 6 IMC 2020 stereo documentary evidence DISK+LightGlue AUC@5°=67.02 + AUC@10°=83.45 (vs SP+LightGlue 59.03/77.96 = +7.99/+5.49 absolute lifts) — STRONGEST documentary cross-illumination geometric robustness vs canonical SP+LightGlue on phototourism stereo. ALIKED paper Source #75 Aachen Day-Night documentary at 2048 keypoints / mNN gives DISK 70.4/82.7/94.9 (vs SuperPoint 69.4/78.6/87.8 = +1.0/+4.1/+7.1 absolute across the three tiers, i.e. only +1.0 at the strictest tier): DISK is comparable to SuperPoint at the strictest Aachen Day-Night tier and stronger on phototourism stereo, while ALIKED-N(32) is stronger on Aachen Day-Night. Aerial nadir cross-season + cross-conflict validation unverified — D-C2-1 retrain decision + AerialExtreMatch + Derkachi flight required
Restriction "Navigation camera (pinned): ADTi 20MP, 5472×3648" Pass (API) — same downscale as canonical DISK+LightGlue consumes any 1024-largest-edge RGB input; sides that are not a multiple of 16 are auto-padded via pad_if_not_divisible=True. The 5472×3648 → 1024×683 downscale (auto-padded to 1024×688, since 688 = 16 × 43) is the same aggressiveness as SP+LightGlue + ALIKED+LightGlue. D-C2-3 input-resolution-shape Plan-phase decision applies identically. Algorithm is resolution-agnostic at API level — preprocess_conf={"resize": 1024} is exposed in canonical DISK extractor; project may choose 1280 or 1536 at Jetson MVE time at proportional latency cost (1280 = ~1.6× compute; 1536 = ~2.25× compute)
Restriction "Satellite Imagery — resolution ≥0.5 m/px" — sparse-matcher pipeline at AC-8.1 floor Verify Same as AC-8.1
Restriction "Satellite Imagery — Cache budget: 10 GB" — sparse-matcher cache footprint Pass — NO C3 cache footprint C3 cache footprint is exactly 0 GB — same as SP+LightGlue + ALIKED+LightGlue; DISK+LightGlue operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time state. Model weights ~26 MB at fp16 = 0.26% of cache budget loaded once at boot
Restriction "Companion computer: Jetson Orin Nano Super, 8 GB shared" Verify — MEDIUM-RISK D-C3-2 Jetson export-pathway gate (TensorRT-equipped path PRESENT vs ALIKED's TensorRT-pathway-ABSENT) + 98.97-GFLOPS-HIGHEST-raw-compute-cost CAVEAT CRITICAL D-C3-2 finding for DISK+LightGlue: Source #73 LightGlue-ONNX SHIPS documented DISK end-to-end ONNX/TensorRT pipeline — Source #73 README changelog 30 Jun 2023 entry "DISK feature extraction support added"; CLI commands parallel SP+LightGlue export with lightglue-onnx export disk_lightglue --num-keypoints 1024 -b 2 -h 1024 -w 1024 --fp16 --device cuda; full TensorRT support via 19 Jul 2023 changelog + MultiHead-Attention fusion + FlashAttention-2 fused ONNX (04 Oct 2023, up to 80% faster inference on long-keypoint sequences) + TopK trick (02 Nov 2023, ~30% speedup). HOWEVER, DISK 98.97 GFLOPs HIGHEST raw-compute-cost among modern competitive sparse extractors (24.4× higher than ALIKED-N(16); 3.8× higher than SuperPoint) → DISK+LightGlue Jetson runtime path strongly PREFERS (c) ONNX Runtime + TensorRT EP via Source #73 (DOMINANT path; ~50-100 ms per pair Jetson Orin Nano Super extrapolation, 3-5× speedup over PyTorch fp16); (d) pure TensorRT via lightglue-onnx trtexec Polygraphy-based pathway as alternate (similar latency profile); PyTorch-fp16-only fallback (~200-400 ms per pair) FAILS AC-4.1 budget at K=10 pairs/frame. Steady-state co-resident memory + GPU-time with C1 + C2 + C4 + C5 + C6 manageable — model footprint advantage compounds (~26 MB at fp16 = comparable to SP+LightGlue + ALIKED+LightGlue)
Restriction "License posture (D-C1-1)" — sparse-matcher license-track interaction POSITIVE finding (FULLY-CLEAN-APACHE-2.0 license track THROUGHOUT) — D-C3-1 RECOMMENDED-PRIMARY-MITIGATION role POSITIVE on canonical cvlab-epfl/disk: Source #76 GitHub API license metadata = Apache-2.0 (license.spdx_id: "Apache-2.0") — permissive, BSD/permissive license track. POSITIVE on cvg/LightGlue matcher: Source #70 LICENSE = Apache-2.0 (Copyright 2023 ETH Zurich) — permissive, BSD/permissive license track. POSITIVE on kornia integration layer: kornia is well-established Apache-2.0 — permissive, BSD/permissive license track. FULLY CLEAN APACHE-2.0 LICENSE TRACK THROUGHOUT — no Magic Leap noncommercial-research disqualifier (vs SP+LightGlue), no GPL-3.0 copyleft (vs SALAD on C2 row), no BSD-3-Clause + Apache-2.0 mixed track (vs ALIKED+LightGlue). CLEANEST license-compliant LightGlue-extractor-sibling in the project's evaluated C3 candidate space. Under D-C1-1 = (a) GPL-3.0 track, (b) BSD/permissive lock, or (c) keep-both-tracks-open, DISK+LightGlue is eligible on every license-posture choice with the simplest license-compliance story (single Apache-2.0 license throughout). D-C3-1 RECOMMENDED-PRIMARY-MITIGATION role — DISK+LightGlue is the cleanest license-compliant + technically-superior + Jetson-deployment-ready C3 choice vs canonical SP+LightGlue's Magic-Leap-restrictive HARD-DISQUALIFIER and ALIKED+LightGlue's PyTorch-fp16-only Jetson runtime restriction. Three converging POSITIVE structural advantages: (i) FULLY CLEAN APACHE-2.0 license track throughout; (ii) PAPER APPENDIX A TABLE 6 +7.99 ABSOLUTE AUC@5° + +5.49 ABSOLUTE AUC@10° on IMC 2020 stereo over canonical SP+LightGlue (strongest documentary technical-superiority signal in the project's evaluated C3 candidate space); (iii) LIGHTGLUE-ONNX TENSORRT EXPORT PATHWAY PRESENT (Source #73 30 Jun 2023 changelog). 
Recommendation: present D-C1-1 + D-C3-1 + this row to user as a structured Choose block at Plan time; DISK+LightGlue is the cleanest license-compliant + technically-superior + Jetson-deployment-ready C3 choice with the trade-off of HIGH retrain cost (~2 weeks on 32 GB V100 vs ALIKED's ~24 hours on RTX 3090) and HIGHEST raw-GFLOPs cost (98.97 GFLOPs vs ALIKED-N(16)'s 4.05 GFLOPs, partially mitigated via TensorRT acceleration)
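The resolution arithmetic in the navigation-camera row can be checked with a small sketch. The helper names `downscale_and_pad` and `relative_compute` are hypothetical and not part of any library. The sketch assumes an aspect-preserving resize rounded to the nearest pixel, padding of each side up to the next multiple of 16, and compute cost that scales with pixel count.

```python
import math

def downscale_and_pad(width, height, largest_edge=1024, multiple=16):
    """Return ((resized_w, resized_h), (padded_w, padded_h)).

    Hypothetical helper: resizes so the longer side equals `largest_edge`,
    then rounds each side up to the next multiple of `multiple`.
    """
    scale = largest_edge / max(width, height)
    resized = (round(width * scale), round(height * scale))
    padded = tuple(math.ceil(side / multiple) * multiple for side in resized)
    return resized, padded

def relative_compute(largest_edge, baseline_edge=1024):
    """Pixel-count ratio versus the baseline (assumes cost scales with area)."""
    return (largest_edge / baseline_edge) ** 2

resized, padded = downscale_and_pad(5472, 3648)
print(resized, padded)            # (1024, 683) (1024, 688)
print(relative_compute(1280))     # 1.5625, quoted in the table as ~1.6x
print(relative_compute(1536))     # 2.25
```

The area-scaling assumption is only a first-order estimate; actual Jetson latency at 1280 or 1536 would still need to be measured at MVE time.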

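The multi-pair latency figures in the AC-NEW-2 row follow from simple range arithmetic. A minimal sketch, assuming sequential (non-batched) matching so that latency scales linearly with K; the function names are hypothetical, and the per-pair ranges are the Jetson extrapolations quoted in the table, not measurements.

```python
def k_pair_latency_ms(per_pair_ms, k):
    """Scale a (low, high) per-pair latency range to K sequential pairs."""
    low, high = per_pair_ms
    return (low * k, high * k)

def headroom(budget_ms, latency_ms):
    """Return (worst-case, best-case) factor by which latency is under budget."""
    low, high = latency_ms
    return (budget_ms / high, budget_ms / low)

AC_NEW_2_BUDGET_MS = 3000   # spoofing-promotion latency budget (p95)
standard = (50, 100)        # extrapolated ms per pair, TensorRT fp16, standard
adaptive = (25, 50)         # extrapolated ms per pair, adaptive depth/width

print(headroom(AC_NEW_2_BUDGET_MS, standard))   # (30.0, 60.0)
print(k_pair_latency_ms(standard, 10))          # (500, 1000)
print(k_pair_latency_ms(adaptive, 10))          # (250, 500)
```

If pairs were batched on the GPU, the linear-in-K assumption would overestimate latency, so the ranges above are best read as a conservative bound.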
Fact #50 — SuperGlue+SuperPoint per-mode API capability verification (canonical magicleap/SuperGluePretrainedNetwork attentional graph neural network sparse matcher + canonical magicleap/SuperPointPretrainedNetwork keypoint extractor; MANDATORY SIMPLE-BASELINE role per engine Component Option Breadth rule on Jetson Orin Nano Super) — DOCUMENTARY PASS WITH MAGIC-LEAP-RESTRICTIVE-LICENSE-HARD-DISQUALIFIER (BYTE-FOR-BYTE IDENTICAL TO SOURCE #72) + TRAINING-CODE-NOT-RELEASED BLOCKS-D-C2-1-RETRAIN + 4-10× SLOWER THAN LIGHTGLUE PER SOURCE #71 PAPER §5 + AERIAL-DOMAIN-TRAINING-CAVEAT (D-C2-1 REUSE) + INFERENCE-ONLY-CODEBASE; closes C3 mandatory pre-screen at 4/N (mandatory simple-baseline role)

  • Statement: SuperGlue+SuperPoint (magicleap/SuperGluePretrainedNetwork CVPR 2020 Oral; canonical implementation by Paul-Edouard Sarlin + Daniel DeTone + Tomasz Malisiewicz + Andrew Rabinovich, Magic Leap; paired exclusively with canonical Magic Leap SuperPoint extractor per Source #78 README "We do not intend to release the SIFT-based or homography SuperGlue models"; both inherit the Magic Leap restrictive license disqualifier from Source #72 + Source #78) is the MANDATORY SIMPLE-BASELINE reference for the C3 row per the engine Component Option Breadth rule: the long-established graph-neural-network sparse-matcher reference baseline that defines the simple-baseline floor that modern leads (LightGlue, XFeat, etc.) must measurably exceed. SuperGlue+SuperPoint is NOT a Selected candidate for the project's deployment because:
    (i) HARD LICENSE DISQUALIFIER: Source #78 LICENSE wording is byte-for-byte identical to Source #72 SuperPoint LICENSE = Magic Leap "ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY" Software License Agreement, blocks dual-use deployment in eastern/southern Ukraine fixed-wing UAV with AC-NEW-2 spoofing-promotion path;
    (ii) TRAINING CODE NOT RELEASED: Source #78 README explicitly states "We do not intend to release the SuperGlue training code", blocking D-C2-1 retrain decision for SuperGlue+SuperPoint pinned mode;
    (iii) 4-10× SLOWER THAN LIGHTGLUE at competitive but slightly lower accuracy per Source #71 LightGlue paper §5 + Table 2 documentary evidence (LightGlue paper §1 explicitly positions LightGlue as the displacement of SuperGlue in the canonical NetVLAD top-K → sparse matcher → PnP+RANSAC pipeline shape);
    (iv) NO ALTERNATIVE EXTRACTOR PAIRING: paired exclusively with canonical Magic Leap SuperPoint extractor (no SIFT or homography variants released per Source #78 README);
    (v) NO STRUCTURAL ADVANTAGES OVER LIGHTGLUE: no FlashAttention support, no adaptive-depth/adaptive-width pruning (LightGlue paper §3.3), no productized Jetson ONNX/TensorRT export pathway in the LightGlue-ONNX equivalent project;
    (vi) NO STRUCTURAL ADVANTAGES OVER ALIKED+LIGHTGLUE OR DISK+LIGHTGLUE: same Magic Leap restrictive license HARD DISQUALIFIER as canonical SP+LightGlue, but worse runtime AND no retrain capability AND no LightGlue-ONNX-style TensorRT pathway.
    Per the per-Mode API Capability Verification rule, the project's pinned mode is the (SuperPoint MagicLeap-pretrained extractor at 1024×1024 grayscale → up to 1024 keypoints with 256-D descriptors and per-keypoint confidence scores) + (SuperGlue matcher with superglue='outdoor' MegaDepth-trained checkpoint, nms_radius=3, match_threshold=0.2) → up to 1024 2D-2D correspondences with confidence scores feeding the project's downstream C4 PnP+RANSAC pose estimator. The canonical inference pipeline differs from cvg/LightGlue's Python-API one-liner: SuperGlue uses the match_pairs.py CLI script with a text file of image-pair paths, producing .npz files with keys {keypoints0, keypoints1, matches, match_confidence}. Two separately-cataloged SuperGlue pretrained-weights sibling modes are documented in Source #78: superglue_indoor.pth (ScanNet-trained; recommended config --resize 640 --superglue indoor --max_keypoints 1024 --nms_radius 4; documentary results AUC@5/10/20=16.12/33.76/51.79 on ScanNet 1500-pair test) + superglue_outdoor.pth (MegaDepth-trained; recommended config --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float; documentary results AUC@5/10/20=39.02/59.51/75.72 on YFCC 4000-pair test).
Mode-enumeration query (1/3) — context7 NOT INDEXED + WebFetch fallback PASS: context7 resolve-library-id returned no relevant matches for "SuperGlue" feature matcher (top-result was Superglue API orchestration which is unrelated to feature-matching); per Per-Mode API Capability Verification rule item 2, fall-back to official-docs WebFetch on the canonical magicleap/SuperGluePretrainedNetwork README + LICENSE + GitHub API license metadata was used (Source #78) plus canonical paper WebFetch (Source #79). Pinned-mode runnable example query (2/3) — WebFetch PASS: Source #78 README ships canonical match_pairs.py CLI demo ./match_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float --input_dir assets/phototourism_sample_images/ --input_pairs assets/phototourism_sample_pairs.txt --output_dir dump_match_pairs_outdoor --viz for outdoor matching; output format documented as .npz files with {keypoints0: (N0, 2), keypoints1: (N1, 2), matches: (N0,) array of indices into keypoints1 with -1 for unmatched, match_confidence: (N0,)} per README example. Disqualifier-probe query (3/3) — TWO HARD-DISQUALIFIERS SURFACED: (a) Magic Leap restrictive LICENSE byte-for-byte identical to Source #72 — Source #78 LICENSE wording is "ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY" with identical clauses prohibiting commercial / dual-use deployment; (b) TRAINING CODE NOT RELEASED — Source #78 README "We do not intend to release the SuperGlue training code" — D-C2-1 retrain decision is STRUCTURALLY BLOCKED for SuperGlue+SuperPoint pinned mode at the project level, unlike SP+LightGlue (where LightGlue training code IS released and SP-reproduction with permissive license is a documented mitigation pathway). 
Did NOT surface any documented frame-rate floor; did NOT surface any documented memory ceiling at the algorithm level beyond the standard SuperGlue+SuperPoint footprint (SuperPoint ~5 MB at fp16 + SuperGlue ~50 MB at fp16 = ~55 MB total; the SuperGlue matcher alone is roughly 4× LightGlue's 12 MB matcher); did NOT surface any Jetson Orin Nano measurement directly (similar to all C-row components). Three NEGATIVE structural findings vs all LightGlue siblings: (vii) 4-10× SLOWER THAN LIGHTGLUE per Source #71 paper §5 + Table 2 — LightGlue paper Table 2 documents SP+LightGlue MegaDepth-1500 AUC@5°/10°/20°=66.7/79.3/87.9 at 44.2 ms standard / 31.4 ms adaptive RTX 3080 vs SP+SuperGlue at slightly lower AUC + 4-10× slower runtime (e.g., SP+SuperGlue at ~150-200 ms RTX 3080 standard); on Jetson Orin Nano Super extrapolation SP+SuperGlue would be ~600-1200 ms per pair fp16 = catastrophic AC-4.1 FAIL even at K=1 pair/frame; (viii) NO FLASHATTENTION SUPPORT — SuperGlue's attention layers do not support FlashAttention-2 (LightGlue's structural advantage with up to 80% faster inference per Source #73 04 Oct 2023 changelog); (ix) NO ADAPTIVE-DEPTH/ADAPTIVE-WIDTH PRUNING — SuperGlue is fixed-depth 9-layer GNN (LightGlue's structural advantage paper §3.3 with ~33% average inference-time reduction at <1% accuracy loss, up to 1.86× speedup on easy pairs). Pinned-mode sentence: "We will catalog SuperGlue+SuperPoint with canonical SuperPoint MagicLeap-pretrained extractor + SuperGlue matcher with superglue='outdoor' MegaDepth-trained checkpoint as the MANDATORY SIMPLE-BASELINE reference per engine Component Option Breadth rule — establishes the long-established sparse-matcher reference floor that modern leads (LightGlue, XFeat) must measurably exceed; NOT a Selected candidate due to (a) Magic Leap restrictive license HARD DISQUALIFIER, (b) training-code-not-released blocking D-C2-1 retrain, (c) 4-10× slower than LightGlue per Source #71 paper §5 documentation. 
Inputs {1× ADTi 20MP nav frame stream → grayscale-converted + bilinearly downscaled-to-largest-edge 1024 + 1× cached satellite tile per top-K retrieval result from C2}; expected outputs {up to 1024 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair feeding C4 PnP+RANSAC}; runtime Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble); deployment-ready ONLY in noncommercial-research mode (license blocks dual-use deployment); for project's actual deployment use D-C3-1 RECOMMENDED-PRIMARY-MITIGATION = (a) DISK+LightGlue per Fact #49 instead." MANDATORY-SIMPLE-BASELINE role per engine Component Option Breadth rule — SuperGlue+SuperPoint is the CANONICAL sparse-matcher mandatory-simple-baseline reference for the C3 row, structurally analogous to NetVLAD's mandatory-simple-baseline role in the C2 row (per Fact #45). The role's purpose is to establish the long-established reference floor that modern leads must measurably exceed on deployment-ready license + Jetson-friendly runtime + retrain-capable training; SuperGlue+SuperPoint fails on all three deployment axes (HARD LICENSE DISQUALIFIER + NO RETRAIN + 4-10× SLOWER THAN LIGHTGLUE) but succeeds at the role of being the documented reference baseline that LightGlue + XFeat measurably exceed.
  • Source: Source #78 (magicleap/SuperGluePretrainedNetwork canonical README + LICENSE + GitHub API license metadata — Magic Leap restrictive license byte-for-byte identical to Source #72; two pretrained checkpoints superglue_indoor.pth + superglue_outdoor.pth; canonical inference CLI ./match_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float; TRAINING CODE NOT RELEASED; NO SIFT/homography variants released; documentary results ScanNet 1500-pair test AUC@5/10/20=16.12/33.76/51.79 + YFCC 4000-pair test AUC@5/10/20=39.02/59.51/75.72; hloc integration cross-reference; CVPR 2020 Oral + 3 CVPR 2020 competition wins), Source #79 (canonical paper arXiv:1911.11763 / Sarlin et al. CVPR 2020 Oral — §3 architecture [Attentional Graph Neural Network with self-attention + cross-attention + Optimal Matching Layer with dustbin handling + Sinkhorn algorithm for differentiable optimal transport assignment] + §4 training [end-to-end with sparse keypoint correspondence supervision; ScanNet indoor + MegaDepth outdoor models] + §5 experiments [ScanNet Table 1 + YFCC Table 2 + Phototourism Table 3 + HPatches Table 4]), Source #72 cross-cite (canonical SuperPoint LICENSE — byte-for-byte identical Magic Leap restrictive wording confirms HARD DISQUALIFIER applies to BOTH SuperPoint extractor weights + SuperGlue matcher), Source #71 cross-cite (cvg/LightGlue canonical paper §5 + Table 2 — LightGlue is 4-10× faster than SuperGlue at competitive accuracy; LightGlue paper §1 explicitly positions LightGlue as the displacement of SuperGlue in the canonical NetVLAD top-K → sparse matcher → PnP+RANSAC pipeline shape), Source #73 cross-cite (fabio-sim/LightGlue-ONNX companion — NO PRODUCTIZED SuperGlue ONNX/TensorRT export pathway; LightGlue-ONNX repo supports SP+LightGlue + DISK+LightGlue extractors only; SuperGlue ONNX export is community-maintained third-party, not productized — confirms LightGlue's structural Jetson-deployment advantage over SuperGlue)
  • Phase: Phase 2
  • Target Audience: System architects + C3 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 + D-C3-1 — Magic Leap restrictive HARD DISQUALIFIER applies same as canonical SP+LightGlue) + Plan-phase architect (mandatory-simple-baseline role documentation for engine Component Option Breadth rule compliance)
  • Confidence: for mode-enumeration (two canonical pretrained-weights sibling modes superglue_indoor.pth + superglue_outdoor.pth + SuperPoint extractor pairing wired in canonical match_pairs.py CLI), runnable-example (canonical match_pairs.py + demo_superglue.py runnable inference scripts in Source #78 with explicit recommended configs for indoor + outdoor + sample-pair evaluation), license (Magic Leap restrictive byte-for-byte identical to Source #72 confirmed via Source #78 LICENSE WebFetch + GitHub API license.spdx_id: "NOASSERTION"); for documentary ScanNet + YFCC AUC numbers (per Source #78 README evaluation tables); for TRAINING CODE NOT RELEASED finding (Source #78 README explicit statement "We do not intend to release the SuperGlue training code"); for 4-10× SLOWER THAN LIGHTGLUE finding (per Source #71 paper §5 + Table 2 documentary evidence); for NO PRODUCTIZED Jetson ONNX/TensorRT export pathway (Source #73 LightGlue-ONNX repo supports SP+LightGlue + DISK+LightGlue only, not SuperGlue); for HARD LICENSE DISQUALIFIER for project's dual-use deployment context (same Magic Leap restrictive wording as Source #72 = same disqualifier reasoning per Fact #47 + project's question_decomposition.md hard disqualifier list); ⚠️ for Jetson Orin Nano Super deployment latency / memory / accuracy (no documentary measurement; extrapolation from RTX 3080 SuperGlue ~150-200 ms standard at 1024 keypoints suggests catastrophic AC-4.1 FAIL even at K=1 pair/frame on Jetson Orin Nano Super — this is the SECOND structural disqualifier on top of the license disqualifier); for canonical-checkpoint aerial-domain fitness (canonical training on ScanNet indoor + MegaDepth phototourism outdoor — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue + C2 candidates, D-C2-1 reuse — but D-C2-1 is STRUCTURALLY BLOCKED for SuperGlue+SuperPoint since training code is not released); for project deployment-readiness (license + retrain + runtime ALL three deployment 
axes blocked — confirms mandatory-simple-baseline-only role per engine Component Option Breadth rule)
  • Related Dimension: SQ3+SQ4 / C3 mandatory simple-baseline reference (engine Component Option Breadth rule role) — per-mode API capability verification gate
  • Fit Impact: DOCUMENTARY PASS for the per-mode API capability verification gate ONLY at the mandatory-simple-baseline role — SuperGlue+SuperPoint has a documented runnable per-mode example with the project's pinned configuration (canonical magicleap/SuperGluePretrainedNetwork + canonical SuperPoint pretrained weights + canonical paper algorithmic specification), two documented SuperGlue pretrained-weights sibling modes (superglue_indoor.pth ScanNet-trained / superglue_outdoor.pth MegaDepth-trained), and no API-level disqualifier. HOWEVER, three converging HARD-DISQUALIFIERS for project deployment: (i) HARD LICENSE DISQUALIFIER — Source #78 LICENSE byte-for-byte identical to Source #72 = Magic Leap noncommercial-research-only SLA, blocks dual-use deployment; (ii) TRAINING CODE NOT RELEASED — Source #78 README explicitly blocks D-C2-1 retrain decision for SuperGlue+SuperPoint pinned mode (no project-side mitigation pathway exists, unlike SP+LightGlue where LightGlue training code IS released and SP-reproduction is a documented mitigation); (iii) 4-10× SLOWER THAN LIGHTGLUE at competitive but slightly lower accuracy per Source #71 paper §5 + Table 2 — Jetson Orin Nano Super extrapolation puts SP+SuperGlue at ~600-1200 ms per pair fp16 = catastrophic AC-4.1 FAIL even at K=1 pair/frame. POSITIVE for the role: SuperGlue+SuperPoint IS the canonical sparse-matcher mandatory-simple-baseline reference that the engine's Component Option Breadth rule requires to be cataloged; the role's purpose is to establish the long-established reference floor that modern leads (LightGlue, XFeat) must measurably exceed on deployment-ready license + Jetson-friendly runtime + retrain-capable training. 
No new Plan-phase decision raised by SuperGlue+SuperPoint closure (the mandatory-simple-baseline role is structural, does not require a separate Plan-phase decision; the project's deployment will not select SuperGlue+SuperPoint regardless of D-C1-1 license-posture choice because TRAINING-CODE-NOT-RELEASED + 4-10×-SLOWER are independent disqualifiers from the license disqualifier). NO REUSE of D-C2-1 retrain decision for SuperGlue+SuperPoint pinned mode — D-C2-1 is STRUCTURALLY BLOCKED by training-code-not-released per Source #78 README. NO REUSE of D-C3-2 Jetson runtime path choice for SuperGlue+SuperPoint pinned mode — Source #73 LightGlue-ONNX repo does NOT support SuperGlue end-to-end ONNX/TensorRT export; SuperGlue's only Jetson runtime path is PyTorch-fp16 (catastrophic AC-4.1 FAIL) or third-party community ONNX exports (operationally complex, not productized). C3 mandatory pre-screen status: SuperGlue+SuperPoint closes the C3 mandatory pre-screen at 4 of N candidates (SP+LightGlue at 1/N + ALIKED+LightGlue at 2/N + DISK+LightGlue at 3/N + SuperGlue+SuperPoint mandatory-simple-baseline at 4/N this session). The mandatory-simple-baseline role is STRUCTURALLY COMPLETE for the C3 row per the engine Component Option Breadth rule — no further mandatory-simple-baseline candidates required. License: Magic Leap restrictive for both canonical SuperPoint extractor (Source #72) AND canonical SuperGlue matcher (Source #78) = byte-for-byte identical HARD DISQUALIFIER for project's dual-use deployment context; the canonical SuperGlue+SuperPoint pinned mode is excluded from any Selected status regardless of D-C1-1 license-posture choice. 
Position vs all prior C3 candidates: SuperGlue+SuperPoint is strictly inferior to SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue on every project-relevant deployment axis except for the mandatory-simple-baseline reference role — confirms the engine's Component Option Breadth rule's purpose: cataloging the simple-baseline FORCES the modern leads (DISK+LightGlue + ALIKED+LightGlue + SP+LightGlue) to measurably exceed it on documented-evidence axes (4-10× speedup, training-code-released for retrain capability, and either Apache-2.0 throughout (DISK) or BSD-3-Clause + Apache-2.0 mixed (ALIKED) license-track placement). The project's actual deployment will use D-C3-1 RECOMMENDED-PRIMARY-MITIGATION = (a) DISK+LightGlue per Fact #49.
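
The .npz layout described in the Statement (keys keypoints0, keypoints1, matches, match_confidence, with -1 marking an unmatched keypoint) can be turned into correspondence pairs for C4 roughly as follows. This is a minimal pure-Python sketch on plain sequences; `matched_pairs` and its `min_confidence` parameter are hypothetical names introduced for illustration, and in practice the four arrays would come from loading a file written by match_pairs.py.

```python
def matched_pairs(keypoints0, keypoints1, matches, match_confidence,
                  min_confidence=0.0):
    """Pair each matched keypoint in image 0 with its partner in image 1.

    matches[i] is an index into keypoints1, or -1 when keypoint i is
    unmatched. Returns a list of ((x0, y0), (x1, y1), confidence) tuples.
    """
    pairs = []
    for i, j in enumerate(matches):
        if j < 0:                          # unmatched keypoint
            continue
        confidence = match_confidence[i]
        if confidence < min_confidence:    # optional project-side filter
            continue
        pairs.append((tuple(keypoints0[i]), tuple(keypoints1[j]), confidence))
    return pairs

# Toy data standing in for one loaded .npz file (values are made up).
kpts0 = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
kpts1 = [(12.0, 21.0), (52.0, 61.0)]
matches = [0, -1, 1]
confidence = [0.9, 0.0, 0.15]

pairs = matched_pairs(kpts0, kpts1, matches, confidence, min_confidence=0.2)
print(pairs)   # [((10.0, 20.0), (12.0, 21.0), 0.9)]
```

The extra confidence filter is a project-side choice, not something the CLI requires; setting `min_confidence=0.0` keeps every match the file reports.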

C3 — Per-Mode API Capability Verification (engine Step 2 — SuperGlue+SuperPoint mandatory simple-baseline session entry, 2026-05-08)

MVE — SuperGlue+SuperPoint with superglue='outdoor' MegaDepth-trained checkpoint + 1024 keypoints + 256-D descriptors @ 1024×1024 grayscale → up to 1024 2D-2D correspondences (canonical mandatory-simple-baseline reference; superglue_indoor.pth ScanNet-trained / superglue_outdoor.pth MegaDepth-trained documented as separately-cataloged sibling pretrained-weights modes; NO Plan-phase decision required since mandatory-simple-baseline role is NOT a Selected candidate path)

  • Source: Source #78 (magicleap/SuperGluePretrainedNetwork canonical README + LICENSE + GitHub API license metadata — ./match_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float --input_dir assets/phototourism_sample_images/ --input_pairs assets/phototourism_sample_pairs.txt --output_dir dump_match_pairs_outdoor --viz for canonical pretrained outdoor inference; ./match_pairs.py --resize 640 --superglue indoor --max_keypoints 1024 --nms_radius 4 for indoor; two pretrained checkpoints superglue_indoor.pth + superglue_outdoor.pth distributed in-tree under models/weights/; TRAINING CODE NOT RELEASED; NO SIFT/homography variants released), accessed 2026-05-08; Source #79 (canonical paper arXiv:1911.11763 / Sarlin et al. CVPR 2020 Oral — §3 architecture, §4 training, §5 experiments); Source #72 cross-cite (canonical SuperPoint LICENSE — same Magic Leap restrictive HARD DISQUALIFIER applies); Source #71 cross-cite (cvg/LightGlue paper §5 + Table 2 — LightGlue is 4-10× faster than SuperGlue at competitive accuracy; LightGlue paper §1 explicitly positions LightGlue as the displacement of SuperGlue); Source #73 cross-cite (fabio-sim/LightGlue-ONNX companion — NO PRODUCTIZED SuperGlue ONNX/TensorRT export pathway confirmed)
  • Inputs in the example: Two arbitrary RGB or grayscale images at any (independent) resolutions; canonical demo uses Phototourism sample images at native resolution; SuperPoint extractor converts each grayscale input into a per-image dict {keypoints: (N, 2), scores: (N,), descriptors: (256, N)} where N ≤ max_keypoints (canonical default 1024 for indoor, 2048 for outdoor; project pinned to 1024); SuperGlue matcher input: a single dict holding both images' SuperPoint outputs under suffixed keys (keypoints0/keypoints1, scores0/scores1, descriptors0/descriptors1) together with image0 and image1; output: {matches0: (N0,) array of indices into keypoints1 with -1 for unmatched, matches1: (N1,) array of indices into keypoints0 with -1 for unmatched, matching_scores0: (N0,), matching_scores1: (N1,)}; canonical match_pairs.py CLI dumps .npz files with simplified format {keypoints0, keypoints1, matches, match_confidence} for downstream processing. CRITICAL: SuperGlue+SuperPoint pipeline operates on grayscale input (vs the cvg/LightGlue extractor wrappers, which accept RGB or grayscale and convert internally: SuperPoint RGB→grayscale, DISK + ALIKED grayscale→RGB) — preserved canonical SuperPoint requirement (single channel)
  • Outputs in the example: Up to 1024 2D-2D correspondences with per-correspondence confidence score; canonical README documentary results: ScanNet 1500-pair test (indoor model) AUC@5/10/20 = 16.12/33.76/51.79, Prec=84.37, MScore=31.14; YFCC 4000-pair test (outdoor model) AUC@5/10/20 = 39.02/59.51/75.72, Prec=98.72, MScore=23.61; sample images (15 pairs) AUC@5/10/20=26.99/48.40/64.47. By cross-paper Source #71 LightGlue paper Table 2 cross-cite: LightGlue is 4-10× faster than SuperGlue at competitive accuracy (LightGlue paper §1 explicitly positions LightGlue as the SuperGlue displacement)
  • Project inputs: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) → grayscale-converted (canonical SuperPoint requirement) → bilinearly downscaled-to-largest-edge 1024 → fp16 batch on Jetson Orin Nano Super; per-UAV-frame K=10 top-K retrieved satellite tiles from C2 → grayscale → 1024-largest-edge → fp16; NOTE: project will NOT actually deploy SuperGlue+SuperPoint (mandatory-simple-baseline role; HARD-LICENSE-DISQUALIFIER + 4-10×-SLOWER + TRAINING-CODE-NOT-RELEASED); cataloged for engine Component Option Breadth rule compliance only
  • Project outputs required: Up to 1024 2D-2D correspondences per (UAV-frame, satellite-tile) image pair with confidence scores, in the mandatory-simple-baseline reference role. This role sets the reference floor that modern leads (DISK+LightGlue + ALIKED+LightGlue + SP+LightGlue) must measurably exceed on: (a) deployment-ready license, (b) Jetson-friendly runtime, (c) retrain-capable training. CATASTROPHIC LATENCY-BUDGET FAIL: SP+SuperGlue ~150-200 ms per pair on RTX 3080 fp16 (per Source #71 paper §5 + Table 2 cross-cite of LightGlue 4-10× speedup) → Jetson Orin Nano Super extrapolation ~600-1200 ms per pair fp16 = AC-4.1 FAIL even at K=1 pair/frame (vs 400 ms budget); at K=10 pairs/frame = 6-12 seconds = catastrophic fail. NOT a Selected candidate path; cataloged at mandatory-simple-baseline role only
  • Match assessment: exact mode match for (SuperPoint MagicLeap-pretrained extractor at 1024×1024 grayscale input, 1024 max keypoints, 256-D descriptors, SuperGlue matcher with superglue='outdoor' MegaDepth-trained checkpoint, nms_radius=3, match_threshold=0.2, up to 1024 2D-2D correspondences output with confidence scores); inference CLI (match_pairs.py + demo_superglue.py) exists in canonical magicleap/SuperGluePretrainedNetwork (Source #78); two pretrained-weights sibling modes documented (superglue_indoor.pth / superglue_outdoor.pth); companion cvg/Hierarchical-Localization (hloc) ships canonical NetVLAD top-50 → SuperPoint+SuperGlue → PnP+RANSAC pipeline at the predecessor pipeline-shape reference for SP+LightGlue's modern equivalent (Source #71 paper Table 3); ⚠️ partial input domain (canonical training on ScanNet indoor + MegaDepth phototourism outdoor — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue + C2 candidates); HARD LICENSE DISQUALIFIER (Source #78 LICENSE byte-for-byte identical to Source #72 SuperPoint LICENSE = Magic Leap noncommercial-research-only SLA, blocks dual-use deployment); TRAINING CODE NOT RELEASED (Source #78 README explicit; D-C2-1 retrain decision STRUCTURALLY BLOCKED for SuperGlue+SuperPoint pinned mode); 4-10× SLOWER THAN LIGHTGLUE (per Source #71 paper §5 + Table 2 — Jetson Orin Nano Super extrapolation ~600-1200 ms per pair fp16 = catastrophic AC-4.1 FAIL); NO PRODUCTIZED Jetson ONNX/TensorRT export pathway (Source #73 LightGlue-ONNX does NOT support SuperGlue; SuperGlue ONNX export is community-maintained third-party only)
  • If ⚠️ or : docs do not disqualify the algorithmic mode at the API level, but THREE CONVERGING HARD-DISQUALIFIERS apply at the deployment level (license + retrain + runtime). The (extractor, matcher, keypoint count, descriptor dimension, input size, normalisation, output shape) tuple is documented and runnable directly via magicleap/SuperGluePretrainedNetwork canonical CLI for inference-only research evaluation. HOWEVER, the three converging hard-disqualifiers (Magic Leap restrictive license + training-code-not-released + 4-10×-slower-than-LightGlue) make SuperGlue+SuperPoint NOT a Selected candidate path for the project's actual deployment — cataloged at mandatory-simple-baseline role only per engine Component Option Breadth rule, structurally analogous to NetVLAD's role in the C2 row. → Status: Mandatory simple-baseline (sparse-matcher reference floor) with three converging HARD-DISQUALIFIERS for project deployment (Magic-Leap-restrictive license byte-for-byte identical to Source #72 + TRAINING-CODE-NOT-RELEASED + 4-10×-SLOWER-THAN-LIGHTGLUE) + aerial-domain-training caveat (D-C2-1 reuse but BLOCKED by training-code-not-released), Magic Leap restrictive license track on extractor + matcher. Final ranking deferred to Jetson MVE phase ONLY for mandatory-simple-baseline-reference role measurement — the role's purpose is to establish the long-established reference floor against which modern leads must measurably exceed; SuperGlue+SuperPoint will NOT be promoted to Selected regardless of MVE results. Per the engine Component Option Breadth rule, SuperGlue+SuperPoint closes the C3 mandatory pre-screen mandatory-simple-baseline role at 4 of N candidates (mandatory-simple-baseline role STRUCTURALLY COMPLETE — no further mandatory-simple-baseline candidates required). Subsequent C3 candidates (XFeat, DoGHardNet+LightGlue, SIFT+LightGlue, etc.) will be separately-cataloged in subsequent sessions if needed.
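
The matches0 / matching_scores0 output format described in the bullets above can be turned into matched point pairs with a few lines of NumPy. This is a minimal sketch, not the canonical match_pairs.py code: the helper name decode_matches, the toy arrays, and the extra min_score filter (set to the 0.2 match_threshold quoted above) are illustrative assumptions.

```python
import numpy as np


def decode_matches(keypoints0, keypoints1, matches0, matching_scores0, min_score=0.2):
    """Convert a SuperGlue-style matches0 index array into matched point pairs.

    matches0[i] is the index into keypoints1 matched to keypoints0[i],
    or -1 when keypoint i is unmatched. decode_matches is a hypothetical
    helper written for this note, not part of any library.
    """
    keypoints0 = np.asarray(keypoints0, dtype=np.float32)
    keypoints1 = np.asarray(keypoints1, dtype=np.float32)
    matches0 = np.asarray(matches0)
    scores = np.asarray(matching_scores0, dtype=np.float32)

    # Keep only matched keypoints whose confidence clears the threshold.
    valid = (matches0 > -1) & (scores >= min_score)
    idx0 = np.nonzero(valid)[0]   # indices into keypoints0
    idx1 = matches0[valid]        # corresponding indices into keypoints1
    return keypoints0[idx0], keypoints1[idx1], scores[valid]


# Toy data (assumed values, chosen only to exercise the -1 and threshold paths).
kpts0 = np.array([[10, 20], [30, 40], [50, 60], [70, 80]])
kpts1 = np.array([[11, 21], [52, 61], [90, 95]])
matches0 = np.array([0, -1, 1, 2])
scores0 = np.array([0.9, 0.0, 0.6, 0.1])

pts0, pts1, conf = decode_matches(kpts0, kpts1, matches0, scores0)
print(pts0, pts1, conf)
```

With the toy data, keypoint 1 is dropped because it is unmatched and keypoint 3 because its score is below the threshold, leaving two correspondences. The resulting (pts0, pts1) arrays are the 2D-2D pairs a downstream PnP+RANSAC stage would consume.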

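The "downscaled-to-largest-edge 1024" step named under Project inputs can be checked with simple arithmetic. The sketch below assumes the aspect ratio is preserved and the short edge is rounded to the nearest pixel; the helper name resize_to_largest_edge is hypothetical, and the rounding rule of the actual preprocessing code may differ.

```python
def resize_to_largest_edge(width, height, target=1024):
    """Return (new_width, new_height, scale) so the longer edge equals target.

    Assumption: aspect ratio is preserved and both edges are rounded to the
    nearest integer pixel.
    """
    scale = target / max(width, height)
    return round(width * scale), round(height * scale), scale


# ADTi 20MP nav frame size quoted in this card.
new_w, new_h, scale = resize_to_largest_edge(5472, 3648)
print(new_w, new_h, round(1 / scale, 2))
```

For the 5472×3648 frame this gives a 1024×683 working image, so each working pixel covers roughly 5.3 native pixels along each axis. That factor matters when match coordinates are mapped back to the full-resolution frame.
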
C3 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate (SuperGlue+SuperPoint mandatory simple-baseline addition)

SuperGlue+SuperPoint — per-numbered binding (C3-relevant lines only; cross-cutting N/A above also apply identically)

Cells share the legend defined under the MixVPR sub-matrix (C2). Where a binding is identical in both substance and evidence to the SP+LightGlue or DISK+LightGlue or ALIKED+LightGlue rows, the SuperGlue+SuperPoint row points to those rows to avoid restating; where SuperGlue+SuperPoint's pinned mode produces a materially different binding (catastrophic latency vs all LightGlue siblings, training-code-not-released vs all LightGlue siblings + DISK + ALIKED, mandatory-simple-baseline-only role), the SuperGlue+SuperPoint row carries a distinct evidence cite.

Line Binding Evidence (one-line cite)
AC-1.1 (frame-center within 50 m, ≥80% normal-flight photos) Pass (documentary on YFCC outdoor) → Verify (aerial nadir cross-domain) — but NOT a Selected path Source #78 README YFCC 4000-pair test (outdoor model) AUC@5/10/20 = 39.02/59.51/75.72, Prec=98.72; documentary pose-estimation accuracy comparable to SP+LightGlue at lower-end-of-AUC tier; HOWEVER, NOT a Selected candidate path due to three converging HARD DISQUALIFIERS (license + retrain + runtime). Aerial nadir cross-domain validation moot since AC-4.1 FAIL is the binding-constraint disqualifier
AC-1.2 (frame-center within 20 m, ≥50% normal-flight photos) Pass (documentary on YFCC outdoor) → Verify (but NOT a Selected path) Same as AC-1.1, tighter tail; YFCC AUC@10°=59.51 documentary vs SP+LightGlue MegaDepth-1500 AUC@10°=79.3 is a raw gap of -19.79 absolute, but the two benchmarks differ, so this gap is not like-for-like; the like-for-like gap is ~1-3 absolute AUC lower per Source #71 paper Table 2; NOT a Selected candidate path due to converging hard disqualifiers
AC-2.1b (satellite-anchor registration succeeds, AC-1.1/1.2 + AC-2.2 + AC-8.2 + AC-8.6 conditions) Pass (documentary) → Verify — but NOT a Selected path C3's contribution is the geometric verification step; SuperGlue+SuperPoint provides documented Recall@K + AUC at the mandatory-simple-baseline reference floor that LightGlue siblings measurably exceed per Source #71 paper Table 2; NOT a Selected candidate path — the role is the simple-baseline reference floor only
AC-3.3 (≥3 disconnected segments via satellite-reference re-localization) Pass (per-pair stateless) → NOT a Selected path SuperGlue's per-pair geometric verification is stateless — same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue; NOT a Selected candidate path for re-localization due to AC-4.1 catastrophic fail at K=10 pairs/frame
AC-4.1 (latency <400 ms p95, end-to-end camera→FC) CATASTROPHIC FAIL — 4-10× SLOWER THAN LIGHTGLUE per Source #71 paper §5 CRITICAL latency-budget DISQUALIFIER: Source #71 paper §5 + Table 2 documents LightGlue is 4-10× faster than SuperGlue at competitive accuracy → SP+SuperGlue ~150-200 ms per pair on RTX 3080 fp16 standard (vs SP+LightGlue 44.2 ms standard / 31.4 ms adaptive). Jetson Orin Nano Super extrapolation: SP+SuperGlue ~600-1200 ms per pair fp16 = AC-4.1 FAIL even at K=1 pair/frame (vs 400 ms budget); at K=10 pairs/frame = 6-12 seconds = CATASTROPHIC FAIL. NO MITIGATION PATHWAY: (i) Source #73 LightGlue-ONNX does NOT support SuperGlue end-to-end ONNX/TensorRT export → no TensorRT acceleration available; (ii) SuperGlue does NOT support FlashAttention-2 → no LightGlue-equivalent ~80% inference speedup available; (iii) SuperGlue does NOT support adaptive-depth/adaptive-width pruning → no LightGlue paper §3.3-equivalent ~33% inference-time reduction available. NOT a Selected candidate path — confirms the engine's Component Option Breadth rule purpose: SuperGlue+SuperPoint mandatory-simple-baseline FORCES modern leads (LightGlue) to demonstrate measurable speedup advantage
AC-4.2 (memory <8 GB shared) Pass (with Verify) — comparable model footprint to SP+LightGlue SuperPoint ~5 MB at fp16 + SuperGlue ~50 MB at fp16 = ~55 MB total model weights (slightly larger than SP+LightGlue's ~27 MB but both well within AC-4.2 budget). Activations comparable to SP+LightGlue. Co-resident memory pressure with C1/C2/C4/C5/C6 manageable — but NOT a Selected candidate path so MVE measurement is mandatory-simple-baseline reference role only
AC-8.1 (cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px) Pass (with Verify) — resolution-agnostic at API level SuperGlue+SuperPoint is resolution-agnostic at the algorithm level; canonical demo evaluates at 640 (indoor) or 1600 (outdoor) largest-edge; NOT a Selected candidate path so cross-resolution validation is mandatory-simple-baseline reference role only
AC-8.6 — Scale-ratio (any UAV-frame ground footprint at deployment altitude must be retrievable) Verify — NOT a Selected path Same as SP+LightGlue scale-ratio row; NOT a Selected candidate path so multi-scale extension consideration is moot
AC-8.6 — Scene change in active-conflict sectors Verify — NOT a Selected path Cross-season + scene-change generalization comparable to SP+LightGlue but with structural disadvantage of no-training-code-released for D-C2-1 retrain mitigation; NOT a Selected candidate path
AC-8.6 — Compute & latency under steady-state and re-loc-trigger CATASTROPHIC FAIL — same AC-4.1 disqualifier Same disqualifier as AC-4.1; SuperGlue+SuperPoint cannot meet steady-state latency budget at K=10 pairs/frame on Jetson Orin Nano Super; NOT a Selected candidate path
AC-NEW-2 (spoofing-promotion latency <3 s p95) CATASTROPHIC FAIL — same AC-4.1 disqualifier Same disqualifier as AC-4.1; SuperGlue+SuperPoint single-pair latency on Jetson ~600-1200 ms vs 3 s budget = within budget at K=1 but fails at K=10 (6-12 s); NOT a Selected candidate path
AC-NEW-6 (imagery freshness — never satellite_anchored on stale-tile match) Pass (mechanical) — NOT a Selected path SuperGlue+SuperPoint produces 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair; freshness-age decision is a downstream C5/C6 filter; NOT a Selected candidate path so mechanical pass is moot
AC-NEW-7 (cache-poisoning safety budget — P(>30 m geo-misalign) <1%, P(>100 m) <0.1%) Pass — STRUCTURAL geometric-verification at simple-baseline reference floor Same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue rows — C3's per-correspondence confidence threshold + RANSAC inlier selection provides structural geometric-verification layer; but at the simple-baseline reference floor accuracy (1-3 absolute lower AUC than LightGlue siblings per Source #71 paper Table 2); NOT a Selected candidate path so cache-poisoning defense at simple-baseline floor is moot
Restriction "Operational area: eastern/southern Ukraine" — sparse-matcher train-domain match STRUCTURALLY BLOCKED — TRAINING CODE NOT RELEASED per Source #78 README CRITICAL D-C2-1 BLOCKER: Source #78 README explicit statement "We do not intend to release the SuperGlue training code" — D-C2-1 retrain decision is STRUCTURALLY BLOCKED for SuperGlue+SuperPoint pinned mode, unlike SP+LightGlue (where LightGlue training code IS released and SP-reproduction with permissive license is a documented mitigation pathway) or DISK+LightGlue (where DISK training code IS released and colmap/colmap2dataset.py workflow allows aerial retrain) or ALIKED+LightGlue (where ALIKED training code IS released and sparse NRE loss enables low-cost retrain). NOT a Selected candidate path — D-C2-1 retrain decision moot
Restriction "Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural)" — sparse-matcher scale band match Verify — NOT a Selected path Same as AC-8.6 scale-ratio; NOT a Selected candidate path
Restriction "Weather: predominantly sunny ... seasonal/visibility classes" — sparse-matcher cross-season generalization Verify — NOT a Selected path Cross-season generalization at simple-baseline reference floor; NOT a Selected candidate path since D-C2-1 retrain mitigation is structurally blocked
Restriction "Navigation camera (pinned): ADTi 20MP, 5472×3648" Pass (API) — same downscale as canonical SuperGlue+SuperPoint consumes 1024-largest-edge grayscale input; same downscale as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue. D-C2-3 input-resolution-shape Plan-phase decision applies identically but moot since NOT a Selected candidate path
Restriction "Satellite Imagery — resolution ≥0.5 m/px" — sparse-matcher pipeline at AC-8.1 floor Verify — NOT a Selected path Same as AC-8.1
Restriction "Satellite Imagery — Cache budget: 10 GB" — sparse-matcher cache footprint Pass — NO C3 cache footprint C3 cache footprint is exactly 0 GB — same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue; SuperGlue+SuperPoint operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time state
Restriction "Companion computer: Jetson Orin Nano Super, 8 GB shared" CATASTROPHIC FAIL — same AC-4.1 + NO LIGHTGLUE-ONNX-EQUIVALENT TensorRT pathway CRITICAL Jetson deployment DISQUALIFIER: Source #73 LightGlue-ONNX does NOT support SuperGlue end-to-end ONNX/TensorRT export — Source #73 README changelog supports SP+LightGlue (28 Jun 2023) + DISK+LightGlue (30 Jun 2023) but NO SuperGlue entry; CLI examples use superpoint or disk positional argument only, no superglue variant; citations cite LightGlue + SuperPoint + DISK papers only with no SuperGlue reference. SuperGlue ONNX export is community-maintained third-party only (e.g., onnx-modifier, ONNX SuperGlue community ports), NOT productized. Implication for D-C3-2: SuperGlue+SuperPoint's only Jetson runtime path is (a) PyTorch-fp16 baseline (~600-1200 ms per pair = catastrophic AC-4.1 FAIL even at K=1) or (b) third-party community ONNX exports (operationally complex, accuracy-unverified, not productized). NOT a Selected candidate path — Jetson runtime catastrophically fails AC-4.1 budget
Restriction "License posture (D-C1-1)" — sparse-matcher license-track interaction HARD DISQUALIFIER (byte-for-byte identical to Source #72 SuperPoint LICENSE) — mandatory-simple-baseline-only role CRITICAL on canonical magicleap/SuperGluePretrainedNetwork: Source #78 LICENSE wording is byte-for-byte identical to Source #72 SuperPoint LICENSE = Magic Leap "ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY" Software License Agreement — non-OSI-approved (GitHub API license.spdx_id: "NOASSERTION"). HARD DISQUALIFIER for canonical SuperGlue+SuperPoint pinned mode in project's commercial/dual-use deployment context (eastern/southern Ukraine fixed-wing UAV with AC-NEW-2 spoofing-promotion path is dual-use military by every reasonable interpretation, and the project's question_decomposition.md hard disqualifier list includes "anything whose license blocks military / dual-use deployment"). Mandatory-simple-baseline-only role — SuperGlue+SuperPoint cataloged for engine Component Option Breadth rule compliance only; will NOT be promoted to Selected regardless of D-C1-1 license-posture choice because TRAINING-CODE-NOT-RELEASED + 4-10×-SLOWER are independent disqualifiers from the license disqualifier. The role's purpose is to establish the long-established reference floor against which modern leads (DISK+LightGlue + ALIKED+LightGlue + SP+LightGlue) must measurably exceed at deployment-ready license + Jetson-friendly runtime + retrain-capable training
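
The AC-4.1 arithmetic used throughout the table above can be reproduced as a small budget check. This is a sketch under stated assumptions: the 600-1200 ms per-pair range is the document's own Jetson extrapolation (not a measurement), pairs are assumed to be matched serially, and only matcher time is counted (no extraction, retrieval, or pose estimation). The function names are hypothetical.

```python
BUDGET_MS = 400.0                      # AC-4.1 end-to-end p95 budget
JETSON_PER_PAIR_MS = (600.0, 1200.0)   # extrapolated SP+SuperGlue range per pair


def matcher_latency_range_ms(pairs_per_frame, per_pair_ms=JETSON_PER_PAIR_MS):
    """Scale the per-pair latency range by K, assuming serial matching."""
    low, high = per_pair_ms
    return low * pairs_per_frame, high * pairs_per_frame


def fits_budget(pairs_per_frame, budget_ms=BUDGET_MS):
    """True only if even the optimistic end of the range is under budget."""
    low, _ = matcher_latency_range_ms(pairs_per_frame)
    return low < budget_ms


k1 = matcher_latency_range_ms(1)
k10 = matcher_latency_range_ms(10)
print(k1, fits_budget(1))
print(k10, fits_budget(10))
```

Even the optimistic bound at K=1 (600 ms) exceeds the 400 ms budget, and K=10 lands at 6-12 seconds, which matches the AC-4.1 row. Because the matcher alone overruns the budget, adding the other pipeline stages can only widen the gap.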

Fact #51 — XFeat per-mode API capability verification (canonical verlab/accelerated_features lightweight-CNN extractor + matcher with three inference modes [XFeat sparse + XFeat* semi-dense + XFeat+LighterGlue paired-matcher companion]; CVPR 2024 modern-competitive-lead with STRONGEST DOCUMENTED EMBEDDED-DEPLOYMENT SIGNAL + STRONGEST RETRAIN-FRIENDLINESS SIGNAL among all C3 candidates evaluated; on Jetson Orin Nano Super) — DOCUMENTARY PASS WITH APACHE-2.0-CLEAN-LICENSE-THROUGHOUT + APACHE-2.0-CLEAN-LIGHTERGLUE-COMPANION-MODE + ORANGE-PI-ZERO-3-1.8-FPS-EMBEDDED-DEPLOYMENT-EVIDENCE + 36-HOURS-RTX-4090-6.5-GB-VRAM-RETRAIN-EVIDENCE + NO-PRODUCTIZED-ONNX/TENSORRT-EXPORT-PATHWAY-CAVEAT (D-C3-2 HARSHER vs DISK BUT TECHNICALLY SIMPLER vs ALIKED) + AERIAL-DOMAIN-TRAINING-CAVEAT (D-C2-1 REUSE) + 64-D-DESCRIPTOR-COMPACT-CACHE-ADVANTAGE + D-C3-6 NEW XFeat-mode-choice; closes C3 mandatory pre-screen at 5/N (modern-competitive-lead role)

  • Statement: XFeat (verlab/accelerated_features CVPR 2024; canonical implementation by Guilherme Potje + Felipe Cadar + André Araújo + Renato Martins + Erickson Nascimento, UFMG VerLab + Université de Bourgogne + Google Research + Université de Lorraine + Microsoft cross-affiliations; Apache-2.0 throughout per Source #80 GitHub API metadata license.spdx_id: "Apache-2.0") is the MODERN-COMPETITIVE-LEAD reference for the C3 row's lightweight-CNN axis with THREE PRIMARY INFERENCE MODES: (i) XFeat sparse — top-K up to 4096 keypoints + 64-D float descriptors + Mutual Nearest Neighbor (MNN) matching; (ii) XFeat* semi-dense — up to 10k features + 2-scale processing (0.65× + 1.3× input resize) + MNN + lightweight MLP-based offset refinement (offset prediction confidence threshold 0.2); (iii) XFeat+LighterGlue paired-matcher — VerLab-trained smaller LightGlue variant ~3× faster than original LightGlue per Source #80 README claim, distributed in-tree via xfeat+lg_torch_hub.ipynb. Per the per-Mode API Capability Verification rule, the project's pinned mode is the (XFeat extractor at 1024-largest-edge grayscale or RGB input + 1024 max keypoints + 64-D float descriptors) + (matcher mode = XFeat sparse with MNN OR XFeat+LighterGlue paired-matcher with cvg/glue-factory-trained LighterGlue) → up to 1024 2D-2D correspondences with confidence scores feeding the project's downstream C4 PnP+RANSAC pose estimator. The canonical inference API is the simplest of any C3 candidate evaluated: 3-line PyTorch native (from modules.xfeat import XFeat; xfeat = XFeat(); output = xfeat.detectAndCompute(torch.randn(1,3,480,640), top_k=4096)[0]) or Torch Hub one-liner (torch.hub.load('verlab/accelerated_features', 'XFeat', pretrained=True, top_k=4096)). 
Mode-enumeration query (1/3): context7 NOT INDEXED + WebFetch fallback PASS. context7 resolve-library-id returned the just-sultanov/xfeat git-worktree-management CLI utility (UNRELATED to the canonical XFeat feature-matching library); per Per-Mode API Capability Verification rule item 2, the session fell back to official-docs WebFetch on the canonical verlab/accelerated_features README + GitHub API license metadata (Source #80) plus the canonical paper arXiv:2404.19174 (Source #81). Pinned-mode runnable example query (2/3): WebFetch PASS. Source #80 README + Source #81 paper §4 document eight Colab notebooks (including minimal_example.ipynb, xfeat_matching.ipynb, xfeat_torch_hub.ipynb, XFeat_training_example.ipynb, xfeat+lg_torch_hub.ipynb) plus three evaluation scripts (including python3 -m modules.eval.megadepth1500 --matcher xfeat --ransac-thr 2.5 and python3 -m modules.eval.scannet1500) and the realtime_demo.py real-time demo; the evaluation scripts produce documented MegaDepth-1500 + ScanNet-1500 + HPatches AUC/MHA numbers reproducing paper Table 1 + Table 2 + Table 3.
Disqualifier-probe query (3/3) — TWO POSITIVE FINDINGS + ONE NEGATIVE FINDING: (a) STRONGEST DOCUMENTED EMBEDDED-DEPLOYMENT SIGNAL AMONG ALL C3 CANDIDATES EVALUATED — Source #81 paper Appendix C explicit Orange Pi Zero 3 ($28 ARM Cortex-A53 device) at 480×360 input documents XFeat=1.8 FPS vs SuperPoint=0.16 FPS (11.25× faster) vs ALIKE=0.58 FPS (3.1× faster); paper explicitly states "XFeat is the ONLY learned method capable of running over 1 FPS on highly-constrained embedded device" without neural-network-inference optimization at 2024 publication time; Source #80 README explicitly states "Simple architecture components which facilitates deployment on embedded devices (jetson, raspberry pi, custom AI chips, etc..)" — strongest embedded-deployment story among all C3 candidates evaluated; (b) STRONGEST RETRAIN-FRIENDLINESS SIGNAL AMONG ALL C3 CANDIDATES EVALUATED — Source #81 paper §3.3 + Appendix B explicit "trained on a single NVIDIA RTX 4090 GPU, consuming 6.5 GB of VRAM in total, considering both training and synthetic warps done on the fly on GPU" + 36 hours total convergence + batch size 10 + 160k iterations + Adam LR 3e-4 + exponential decay; paper §3.3 explicit "low memory usage of our method enables training on entry-level hardware, facilitating the fine-tuning or full training of our network for specific tasks and scene types"; MUCH cheaper than DISK+LightGlue (~2 weeks 32 GB V100 per Source #77) + comparable to ALIKED+LightGlue (~24 hours RTX 3090 per Source #74) + infinitely better than SuperGlue+SuperPoint (training-code-not-released per Source #78); (c) NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo — Source #80 README Contributing section explicit ask "Currently, it would be nice to have an export script to efficient deployment engines such as TensorRT and ONNX"; ONNX/TensorRT export is community-contribution-needed, NOT productized in verlab/accelerated_features master HEAD or in any companion repo equivalent to Source #73 
LightGlue-ONNX. D-C3-2 gate is HARSHER than DISK+LightGlue (which has Source #73 LightGlue-ONNX TensorRT pathway with fp16 + FlashAttention-2 + TopK-trick acceleration) but TECHNICALLY SIMPLER than ALIKED+LightGlue (which has torchvision.ops.deform_conv2d ONNX-export-difficulty blocker) — XFeat is CNN-only with no deformable convolutions or unusual ops, just Conv + ReLU + BatchNorm; project would need to invest custom-ONNX-export engineering effort but the architecture is straightforward and would not encounter ALIKED's deform_conv2d blocker. Documentary headline performance (per Source #81 paper Table 1 MegaDepth-1500 i5-1135G7 CPU VGA, AUC@5°/10°/20° + FPS): SuperPoint = 37.3/50.1/61.5 at 3.0 FPS (4096 kpts) / DISK = 53.8/65.9/75.0 at 1.2 FPS / DISK* = 55.2/66.8/75.3 at 1.2 FPS (10k kpts) / ALIKE-Tiny = 49.4/61.8/71.4 at 5.3 FPS / XFeat sparse = 42.6/56.4/67.7 at 27.1 FPS (4096 kpts; 9× faster than SuperPoint at HIGHER AUC + 5× faster than ALIKE) / XFeat* semi-dense = 50.2/65.4/77.1 at 19.2 FPS (10k features; comparable to DISK* at 16× speedup with 1885 inliers per pair vs LightGlue 475 inliers); paper Table 2 ScanNet-1500 indoor: XFeat 16.7/32.6/47.8 + XFeat* 18.4/34.7/50.3 outperforms ALL baselines including SuperPoint=12.5/24.4/36.7 + DISK=9.6/11.3 + ALIKE=8.0 despite all methods being MegaDepth-trained (paper Appendix E attributes XFeat's superior cross-domain generalization to hybrid MegaDepth+synthetic-warp-COCO training reducing landmark-dataset overfitting bias); paper Table 3 HPatches homography MHA@3 Illumination/Viewpoint = 95.0/68.6 (XFeat) — best illumination@3 in paper Table 3 across all evaluated methods including SuperPoint 94.6 + DISK 94.6. 
Documentary headline performance vs LightGlue siblings (per Source #80 README MegaDepth-1500 cross-cite vs SP+LightGlue): XFeat+LighterGlue Fast (640 max dim, 1300 kpts) AUC@5/10/20 = 0.444/0.610/0.746 vs SP+LightGlue 0.469/0.633/0.762 (-2.5/-2.3/-1.6 absolute); Accurate (1024 max dim, 4096 kpts) AUC@5/10/20 = 0.564/0.710/0.819 vs SP+LightGlue 0.591/0.738/0.841 (-2.7/-2.8/-2.2 absolute) — XFeat+LighterGlue is modestly below SP+LightGlue at competitive accuracy + ~3× LighterGlue speedup. Pinned-mode sentence: "We will catalog XFeat (lightweight-CNN extractor + sparse/semi-dense/LighterGlue-paired matcher) with the canonical Apache-2.0 weights from verlab/accelerated_features as the MODERN-COMPETITIVE-LEAD reference for the C3 row's lightweight-CNN axis at the documentary level. Inputs {1× ADTi 20MP nav frame stream → grayscale-converted-or-RGB + bilinearly downscaled-to-largest-edge 1024 + 1× cached satellite tile per top-K retrieval result from C2}; expected outputs {up to 1024 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair feeding C4 PnP+RANSAC}; runtime Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble); the pinned mode preserves XFeat's clean Apache-2.0 license track + the strongest documented embedded-deployment signal + the strongest retrain-friendliness signal among all C3 candidates evaluated; trade-off — D-C3-2 ONNX/TensorRT export pathway is community-contribution-needed, not productized in canonical repo (HARSHER than DISK+LightGlue but TECHNICALLY SIMPLER than ALIKED+LightGlue because XFeat is CNN-only with no deformable convolutions)." MODERN-COMPETITIVE-LEAD ROLE per engine Component Option Breadth rule — XFeat closes the C3 mandatory pre-screen at 5 of N candidates (SP+LightGlue at 1/N + ALIKED+LightGlue at 2/N + DISK+LightGlue at 3/N + SuperGlue+SuperPoint mandatory-simple-baseline at 4/N + XFeat modern-competitive-lead at 5/N this session). 
XFeat is the only modern-competitive-lead C3 candidate with explicit embedded-device benchmarks + low retrain cost, expanding the C3 row's modern-competitive-lead axis with a structurally-different design point (lightweight CNN with decoupled keypoint detection + lightweight MLP-based match refinement vs LightGlue's transformer-based attention matcher). D-C3-6 NEW Plan-phase decision raised: XFeat-mode-choice between (a) XFeat sparse with MNN matching for SIMPLEST deployment (no separate matcher network required, fewest moving parts), (b) XFeat* semi-dense with MNN+offset-refinement for HIGHEST inlier count per pair (1885 vs LightGlue 475 per Source #81 Appendix F Table 6), (c) XFeat+LighterGlue paired-matcher for MODERN learned-matcher accuracy with VerLab-trained LighterGlue ~3× faster than canonical LightGlue per Source #80 README claim. No new D-C3-2 sub-decision raised by XFeat closure beyond the inherited LightGlue-inference-runtime D-C3-2 (which applies to XFeat+LighterGlue companion mode same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue); the XFeat-only standalone modes (sparse + semi-dense) sidestep D-C3-2 entirely since they don't depend on cvg/LightGlue's matcher backbone. D-C2-1 retrain decision REUSE with strongest retrain-friendliness signal — for D-C2-1 = (a) project-domain retrain on aerial nadir corpus, XFeat is materially the cheapest C3 candidate to retrain (36 hours single RTX 4090 / 6.5 GB VRAM total vs DISK ~2 weeks 32 GB V100 / vs ALIKED ~24 hours RTX 3090 / SuperGlue training-code-not-released-blocked).
  • Source: Source #80 (verlab/accelerated_features canonical README + LICENSE + GitHub API license metadata — Apache-2.0 throughout per license.spdx_id: "Apache-2.0"; three inference modes XFeat sparse / XFeat* semi-dense / XFeat+LighterGlue paired-matcher companion; minimalist 3-line inference API + Torch Hub one-liner + 8 Colab notebooks + 3 evaluation scripts + 1 real-time webcam demo; TRAINING CODE RELEASED including Colab notebook + canonical training command; NO PRODUCTIZED ONNX/TensorRT EXPORT per Contributing section ask; documentary results Source #81 paper Table 1 MegaDepth-1500 + Table 2 ScanNet-1500 + Table 3 HPatches + Appendix C Orange Pi Zero 3 embedded-device timing + Appendix F learned-matcher comparison; CVPR 2024 publication with cross-affiliations including UFMG + Université de Bourgogne + Google Research + Université de Lorraine + Microsoft), Source #81 (canonical paper arXiv:2404.19174 / Potje et al. CVPR 2024 — §3 architecture [featherweight CNN backbone with channel sequence {4,8,24,64,64,128} + 23 conv layers + 6 spatial-halving blocks + 2 fusion blocks + decoupled keypoint detection branch with 1×1 convolutions on 8×8 tensor-block-transformed image with knowledge distillation from ALIKE-Tiny teacher + lightweight MLP-based match refinement module] + §3.3 training [hybrid MegaDepth+synthetic-warp-COCO at 6:4 ratio + 800×600 input + batch size 10 + 160k iterations + Adam LR 3e-4 + 36 hours single RTX 4090 + 6.5 GB VRAM total] + §4 experiments [MegaDepth-1500 + ScanNet-1500 + HPatches + Aachen Day-Night + learned-matcher comparison] + Appendix B detailed training description + Appendix C detailed timing analysis on i7-6700K CPU + Orange Pi Zero 3 ARM Cortex-A53 embedded device + Appendix E ScanNet-1500 extended discussion + Appendix F learned-matcher comparison Table 6), Source #71 cross-cite (cvg/LightGlue paper Appendix A — XFeat+LighterGlue companion mode is trained using cvg/glue-factory framework that the LightGlue paper introduces for 
matcher training), Source #73 cross-cite (fabio-sim/LightGlue-ONNX companion repo — does NOT support XFeat or XFeat+LighterGlue end-to-end ONNX/TensorRT pipeline as of January 2026; LightGlue-ONNX changelog + CLI examples + citations support SuperPoint+LightGlue + DISK+LightGlue extractors only, NO XFeat entry — confirms XFeat's D-C3-2 ONNX/TensorRT export pathway is community-contribution-needed)
  • Phase: Phase 2
  • Target Audience: System architects + C3 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 — clean Apache-2.0 throughout) + Plan-phase architect (modern-competitive-lead role for the C3 row's lightweight-CNN axis with strongest documented embedded-deployment signal + strongest retrain-friendliness signal among all C3 candidates evaluated)
  • Confidence: for mode-enumeration (three primary inference modes XFeat sparse + XFeat* semi-dense + XFeat+LighterGlue paired-matcher canonical companion mode wired in canonical repo + Torch Hub one-liner; eight Colab notebooks distributed in-tree), runnable-example (3-line PyTorch native API + Torch Hub one-liner + canonical evaluation harnesses for MegaDepth-1500 + ScanNet-1500 in Source #80 with explicit recommended configs), license (Apache-2.0 confirmed via Source #80 GitHub API license.spdx_id: "Apache-2.0" + README badge + LICENSE file present in canonical repo); for documentary MegaDepth-1500 + ScanNet-1500 + HPatches + Orange Pi Zero 3 numbers (per Source #81 paper Tables 1, 2, 3 + Appendix C); for STRONGEST DOCUMENTED EMBEDDED-DEPLOYMENT SIGNAL AMONG ALL C3 CANDIDATES EVALUATED (Orange Pi Zero 3 1.8 FPS at 480×360 input vs SuperPoint 0.16 FPS / ALIKE 0.58 FPS — 11.25× / 3.1× speedup at the same resolution); for STRONGEST RETRAIN-FRIENDLINESS SIGNAL AMONG ALL C3 CANDIDATES EVALUATED (36 hours single RTX 4090 + 6.5 GB VRAM total vs DISK ~2 weeks 32 GB V100 + ALIKED ~24 hours RTX 3090 + SuperGlue training-code-not-released-blocked); for NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo (README Contributing section explicit community-contribution ask); for 64-D-DESCRIPTOR-COMPACT-CACHE-ADVANTAGE (XFeat 64-D vs SuperPoint 256-D vs DISK 128-D vs ALIKED 128-D = smallest descriptor dimensionality of any modern competitive C3 candidate evaluated, providing 4× / 2× / 2× cache footprint reduction in scenarios that require descriptor caching for the C3 path — but NOT applicable to the project's C3 row since C3 operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time descriptor state per Fact #47 + #48 + #49 disqualifier-probe rows); for D-C3-6 NEW Plan-phase decision raised (XFeat-mode-choice between sparse / semi-dense / +LighterGlue paired-matcher); ⚠️ for Jetson Orin Nano Super deployment latency / memory / accuracy (no 
documentary measurement; extrapolation from Orange Pi Zero 3 1.8 FPS ARM Cortex-A53 + Source #81 paper Table 1 27.1 FPS Intel i5-1135G7 CPU VGA suggests strongest extrapolated latency advantage among all C3 candidates evaluated when paired with TensorRT acceleration — but the absence of productized ONNX/TensorRT export pathway means the project must invest custom export engineering effort to realize this advantage); for canonical-checkpoint aerial-domain fitness (canonical training on MegaDepth phototourism outdoor + COCO_20k synthetic warp pairs at 6:4 ratio — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue + SuperGlue+SuperPoint + C2 candidates, D-C2-1 reuse — but XFeat is the cheapest C3 candidate to execute D-C2-1 = (a) project-domain retrain on aerial nadir corpus at 36 hours single RTX 4090 + 6.5 GB VRAM total per Source #81 §3.3); for canonical-paper Aachen Day-Night documentary numbers (Source #81 paper §4.3 references Aachen Day-Night visual localization but headline numbers not extracted in this session — would need additional WebFetch on paper Table 4 or Appendix to confirm; cross-paper reference via Source #71 LightGlue paper Table 3 Aachen Day-Night documentary numbers does NOT include XFeat as a baseline because LightGlue paper publication 2023 predates XFeat publication 2024 by ~1 year)
  • Related Dimension: SQ3+SQ4 / C3 modern-competitive-lead reference (engine Component Option Breadth rule role — modern-competitive-lead axis expansion with structurally-different design point [lightweight CNN with decoupled keypoint detection + lightweight MLP-based match refinement vs LightGlue's transformer-based attention matcher]) — per-mode API capability verification gate
  • Fit Impact: DOCUMENTARY PASS for the per-mode API capability verification gate at the modern-competitive-lead role: XFeat has a documented runnable per-mode example with the project's pinned configuration (canonical verlab/accelerated_features + canonical pretrained weights + canonical paper algorithmic specification + canonical paper benchmark numbers), three documented primary inference modes (XFeat sparse / XFeat* semi-dense / XFeat+LighterGlue paired-matcher), and NO API-level disqualifier. Three CONVERGING POSITIVE structural advantages: (i) APACHE-2.0 LICENSE THROUGHOUT: canonical repo + LICENSE + companion XFeat+LighterGlue mode all clean Apache-2.0; eligible on every D-C1-1 license-posture path with the cleanest license-compliance story tied with DISK+LightGlue; (ii) STRONGEST DOCUMENTED EMBEDDED-DEPLOYMENT SIGNAL AMONG ALL C3 CANDIDATES EVALUATED: Source #81 paper Appendix C Orange Pi Zero 3 ARM Cortex-A53 1.8 FPS without optimization at 480×360 input, designed explicitly for "jetson, raspberry pi, custom AI chips, etc." per Source #80 README; (iii) STRONGEST RETRAIN-FRIENDLINESS SIGNAL AMONG ALL C3 CANDIDATES EVALUATED: 36 hours single RTX 4090 + 6.5 GB VRAM total per Source #81 §3.3, materially cheaper than DISK + comparable to ALIKED, whereas SuperGlue cannot be retrained from its canonical repo at all (training-code-not-released). One NEGATIVE structural finding: (iv) NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo (Source #80 README Contributing section explicit community-contribution ask); the D-C3-2 gate is HARSHER than DISK+LightGlue's well-documented LightGlue-ONNX TensorRT pathway (Source #73), but TECHNICALLY SIMPLER than ALIKED+LightGlue's torchvision.ops.deform_conv2d ONNX-export blocker because XFeat is CNN-only with no deformable convolutions or unusual ops. 
Two ADDITIONAL CAVEATS: (v) MegaDepth-1500 sparse-mode AUC@5°=42.6 is materially below DISK 53.8 + DISK* 55.2 + ALIKE-Tiny 49.4 at strictest tier (paper Table 1) — XFeat sparse is positioned as "competitive at much higher speed" rather than "best-accuracy"; XFeat+LighterGlue narrows this gap on the LightGlue-paired modes (XFeat+LighterGlue Accurate AUC@5°=0.564 = -2.7 absolute below SP+LightGlue 0.591 per Source #80 README); (vi) AERIAL-DOMAIN-TRAINING CAVEAT shared with all C3 candidates evaluated (canonical training on MegaDepth phototourism + COCO_20k synthetic warp pairs at 6:4 ratio — NOT aerial nadir; D-C2-1 reuse with XFeat as cheapest retrain candidate at 36 hours single RTX 4090). NEW Plan-phase decision raised by XFeat closure: D-C3-6 NEW XFeat-mode-choice — Plan-phase decision between (a) XFeat sparse with MNN matching for SIMPLEST deployment (no separate matcher network required, fewest moving parts; D-C3-2 fully sidesteps cvg/LightGlue dependency for the standalone-extractor mode), (b) XFeat* semi-dense with MNN+offset-refinement for HIGHEST inlier count per pair (1885 inliers vs LightGlue 475 per Source #81 Appendix F Table 6 = 4× more inliers per pair, valuable for the project's downstream C4 PnP+RANSAC pose estimator stability), (c) XFeat+LighterGlue paired-matcher for MODERN learned-matcher accuracy with VerLab-trained LighterGlue ~3× faster than canonical LightGlue per Source #80 README claim. NO REUSE of D-C3-2 Jetson runtime path choice for XFeat sparse + XFeat* semi-dense modes — these standalone-extractor modes do NOT depend on cvg/LightGlue's matcher backbone, so D-C3-2 LightGlue-inference-runtime choice does NOT apply; project would need custom XFeat ONNX/TensorRT export effort regardless of D-C3-2 decision. 
D-C3-2 REUSE for XFeat+LighterGlue paired-matcher mode — same Jetson runtime path choices apply (PyTorch-fp16 / Torch-TensorRT / ONNX Runtime + TensorRT EP / pure TensorRT via trtexec / FP8 ModelOpt-on-Jetson if Ampere FP8 emulation works) but the LighterGlue smaller variant is NOT distributed in Source #73 LightGlue-ONNX repo as of January 2026 — community-contribution-needed for productized export. C3 mandatory pre-screen status: XFeat closes the C3 mandatory pre-screen at 5 of N candidates (SP+LightGlue at 1/N + ALIKED+LightGlue at 2/N + DISK+LightGlue at 3/N + SuperGlue+SuperPoint mandatory-simple-baseline at 4/N + XFeat modern-competitive-lead at 5/N this session). The modern-competitive-lead axis is materially-expanded for the C3 row with a structurally-different design point (lightweight CNN with decoupled keypoint detection + lightweight MLP-based match refinement vs LightGlue's transformer-based attention matcher). License: Apache-2.0 throughout for canonical XFeat extractor + matcher (Source #80) AND XFeat+LighterGlue companion mode (Source #80 + cross-cite Source #70 cvg/LightGlue Apache-2.0 + cross-cite Source #71 cvg/glue-factory training framework Apache-2.0); under D-C1-1 = (a) GPL-3.0 track, (b) BSD/permissive lock, or (c) keep-both-tracks-open, XFeat is eligible on every license-posture choice with the cleanest license-compliance story TIED with DISK+LightGlue. Position vs all prior C3 candidates: XFeat is the first C3 candidate with explicit embedded-device benchmarks + materially-cheapest retrain cost; structurally-different design point from all LightGlue-extractor-siblings; documented to outperform SuperPoint + ALIKE + DISK + DISK* on ScanNet-1500 indoor cross-domain transfer despite all methods being MegaDepth-trained (paper Table 2 + Appendix E hybrid-training generalization advantage); positioned as competitive with much-larger learned matchers (LightGlue + LoFTR) at much higher throughput per Source #81 Appendix F Table 6. 
Final ranking deferred to Jetson MVE phase per the project's D-C1-2 + D-C3-2 deferred-MVE strategy.
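
The ratios quoted in this fact card (11.25× / 3.1× embedded speedup, ~4× inlier gain, 4× / 2× / 2× descriptor size) can be re-derived from the underlying figures. The sketch below is only an arithmetic cross-check: the input numbers are copied from the card as attributed to Source #81, and the helper name `ratios_vs` is hypothetical.

```python
# Figures as quoted in this fact card (Source #81 Appendix C and Appendix F Table 6).
ORANGE_PI_FPS = {"XFeat": 1.8, "SuperPoint": 0.16, "ALIKE": 0.58}  # 480x360 input
INLIERS_PER_PAIR = {"XFeat*": 1885, "LightGlue": 475}
DESCRIPTOR_DIM = {"XFeat": 64, "SuperPoint": 256, "DISK": 128, "ALIKED": 128}


def ratios_vs(reference, table, invert=False):
    """Ratio reference/other for every other entry (other/reference if invert)."""
    ref = table[reference]
    return {
        name: (value / ref if invert else ref / value)
        for name, value in table.items()
        if name != reference
    }


fps_speedup = ratios_vs("XFeat", ORANGE_PI_FPS)                      # XFeat FPS / other FPS
inlier_gain = ratios_vs("XFeat*", INLIERS_PER_PAIR)                  # XFeat* inliers / other
descriptor_saving = ratios_vs("XFeat", DESCRIPTOR_DIM, invert=True)  # other dim / XFeat dim

for label, table in [
    ("FPS speedup", fps_speedup),
    ("inlier gain", inlier_gain),
    ("descriptor size ratio", descriptor_saving),
]:
    for name, r in table.items():
        print(f"{label} vs {name}: {r:.2f}x")
```

The inlier gain comes out just under 4×, which matches the "4× more inliers per pair" rounding used in the card.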

C3 — Per-Mode API Capability Verification (engine Step 2 — XFeat modern-competitive-lead session entry, 2026-05-08)

MVE — XFeat with three primary inference modes (XFeat sparse 4096-keypoint MNN + XFeat* semi-dense 10k-feature MLP-offset-refinement + XFeat+LighterGlue paired-matcher) + 64-D float descriptors @ 1024×1024 grayscale-or-RGB → up to 1024-4096 2D-2D correspondences (canonical modern-competitive-lead reference; D-C3-6 NEW Plan-phase decision required for XFeat-mode-choice between sparse / semi-dense / +LighterGlue)

  • Source: Source #80 (verlab/accelerated_features canonical README + GitHub API license metadata — minimalist 3-line PyTorch native API from modules.xfeat import XFeat; xfeat = XFeat(); output = xfeat.detectAndCompute(torch.randn(1,3,480,640), top_k=4096)[0] for canonical pretrained inference + Torch Hub one-liner torch.hub.load('verlab/accelerated_features', 'XFeat', pretrained=True, top_k=4096); eight Colab notebooks distributed in-tree; canonical pretrained weights via Torch Hub pretrained=True; TRAINING CODE RELEASED with Colab notebook + canonical training command), accessed 2026-05-08; Source #81 (canonical paper arXiv:2404.19174 / Potje et al. CVPR 2024 — §3 architecture, §3.3 training, §4 experiments, Appendix B detailed training, Appendix C detailed timing analysis on Orange Pi Zero 3 ARM Cortex-A53, Appendix F learned-matcher comparison Table 6); Source #71 cross-cite (cvg/LightGlue paper §1 — XFeat+LighterGlue companion mode trained using cvg/glue-factory framework that LightGlue paper introduces); Source #73 cross-cite (fabio-sim/LightGlue-ONNX companion — does NOT support XFeat or XFeat+LighterGlue end-to-end ONNX/TensorRT pipeline as of January 2026, confirming XFeat's D-C3-2 ONNX/TensorRT export pathway is community-contribution-needed)
  • Inputs in the example: Two arbitrary RGB or grayscale images at any (independent) resolutions; canonical demos use VGA (640×480) or 1024-largest-edge for accurate config; XFeat extractor produces per-image dict {keypoints: (N, 2), scores: (N,), descriptors: (64, N) or (N, 64)} where N ≤ top_k (canonical default 4096 for accurate, project pinned to 1024); for XFeat sparse mode: MNN search with 64-D descriptors directly produces 2D-2D correspondences; for XFeat* semi-dense mode: 2-scale processing (0.65× + 1.3× resize) + up to 10k features + MNN + lightweight MLP offset refinement (offset prediction confidence threshold 0.2); for XFeat+LighterGlue paired-matcher mode: per-image XFeat extractor output + LighterGlue paired-matcher with ~3× faster runtime than canonical LightGlue per Source #80 README claim
  • Outputs in the example: Up to 1024-4096 2D-2D correspondences with confidence scores; canonical README + paper documentary results: MegaDepth-1500 (Source #81 paper Table 1, i5-1135G7 CPU VGA, AUC@5°/10°/20° + FPS): XFeat sparse 42.6/56.4/67.7 at 27.1 FPS / XFeat* semi-dense 50.2/65.4/77.1 at 19.2 FPS; ScanNet-1500 (Source #81 paper Table 2): XFeat 16.7/32.6/47.8 + XFeat* 18.4/34.7/50.3 — best in row; HPatches (Source #81 paper Table 3): XFeat MHA@3 Illumination 95.0 / Viewpoint 68.6; XFeat+LighterGlue MegaDepth-1500 (Source #80 README cross-cite): Fast (640 max dim, 1300 kpts) AUC@5/10/20 = 0.444/0.610/0.746 vs SP+LightGlue 0.469/0.633/0.762 (-2.5/-2.3/-1.6 absolute); Accurate (1024 max dim, 4096 kpts) AUC@5/10/20 = 0.564/0.710/0.819 vs SP+LightGlue 0.591/0.738/0.841 (-2.7/-2.8/-2.2 absolute); EMBEDDED-DEVICE TIMING (Source #81 Appendix C, Orange Pi Zero 3 ARM Cortex-A53 at 480×360): XFeat=1.8 FPS vs SuperPoint=0.16 FPS vs ALIKE=0.58 FPS — XFeat is the ONLY learned method capable of running over 1 FPS on highly-constrained embedded device without neural-network-inference optimization
  • Project inputs: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) → grayscale-converted-or-RGB → bilinearly downscaled-to-largest-edge 1024 → fp16 batch on Jetson Orin Nano Super; per-UAV-frame K=10 top-K retrieved satellite tiles from C2 → grayscale-or-RGB → 1024-largest-edge → fp16; NOTE: XFeat supports both grayscale and RGB input per paper §3.1 and README minimal example (PyTorch tensor (B, 3, H, W)); preserves dual-input-mode flexibility
  • Project outputs required: Up to 1024 2D-2D correspondences per (UAV-frame, satellite-tile) image pair with confidence scores; documentary expectation per Source #81 paper Table 1 + Source #80 README cross-cite: XFeat sparse should provide AUC@5°/10°/20° ≈ 42.6/56.4/67.7 documentary baseline + XFeat+LighterGlue should provide AUC@5°/10°/20° ≈ 0.564/0.710/0.819 at the canonical Accurate config (1024 max dim + 4096 kpts); the project pin matches on 1024 max dim but caps keypoints at 1024 (see Inputs in the example), so these figures are a documentary reference, not a like-for-like expectation. Latency budget extrapolation to Jetson Orin Nano Super: XFeat is the strongest extrapolated latency candidate among all C3 candidates evaluated based on Orange Pi Zero 3 ARM Cortex-A53 1.8 FPS (5.5× headroom over Jetson Orin Nano's GPU-based fp16 path), but realizing it requires custom ONNX/TensorRT export effort because the canonical repo ships no productized export pathway (D-C3-2; community contribution requested). Canonical PyTorch-fp16 path on Jetson Orin Nano Super: extrapolated to ~10-30 ms per pair (compared to ALIKED's ~70-140 ms PyTorch-fp16 / DISK's ~200-400 ms PyTorch-fp16 / SP+LightGlue's ~30-60 ms PyTorch-fp16), i.e. comparable to SP+LightGlue at competitive accuracy + no Magic Leap restrictive license disqualifier. At K=10 pairs/frame extrapolated 100-300 ms total = comfortable AC-4.1 satisfaction
  • Match assessment: exact mode match for (XFeat lightweight-CNN extractor at 1024-largest-edge grayscale-or-RGB input, 1024 max keypoints, 64-D float descriptors, three matcher modes XFeat sparse with MNN / XFeat* semi-dense with MNN+MLP-offset-refinement / XFeat+LighterGlue paired-matcher with VerLab-trained LighterGlue, up to 1024 2D-2D correspondences output with confidence scores); inference API (3-line PyTorch native + Torch Hub one-liner) exists in canonical verlab/accelerated_features (Source #80); three primary inference modes documented (sparse + semi-dense + LighterGlue-paired); companion cvg/glue-factory framework for LighterGlue training (Source #71 cross-cite); ⚠️ partial input domain (canonical training on MegaDepth phototourism outdoor + COCO_20k synthetic warp pairs at 6:4 ratio — NOT aerial nadir; same caveat as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue + SuperGlue+SuperPoint + C2 candidates); ⚠️ NO documentary aerial-domain validation (D-C2-1 reuse — but XFeat is the cheapest C3 candidate to execute D-C2-1 = (a) project-domain retrain at 36 hours single RTX 4090 + 6.5 GB VRAM total per Source #81 §3.3); NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo (Source #80 README Contributing section explicit community-contribution ask) — D-C3-2 gate HARSHER than DISK+LightGlue but TECHNICALLY SIMPLER than ALIKED+LightGlue
  • If ⚠️ or : docs do not disqualify the algorithmic mode at the API level, and NO HARD DISQUALIFIERS apply at the deployment level. The (extractor, matcher mode 1/2/3, keypoint count, descriptor dimension, input size, normalisation, output shape) tuple is documented and runnable directly via canonical verlab/accelerated_features repo for inference + training + evaluation. Three POSITIVE structural advantages (clean Apache-2.0 throughout + strongest embedded-deployment signal among all C3 candidates evaluated + strongest retrain-friendliness signal among all C3 candidates evaluated) make XFeat eligible as a Selected candidate path under every D-C1-1 license-posture choice. Two CAVEATS: (i) MegaDepth-1500 sparse-mode AUC@5°=42.6 is materially below DISK+LightGlue + ALIKED+LightGlue + SP+LightGlue at strictest tier (-7 to -25 absolute) — XFeat sparse is positioned as "competitive at much higher speed" rather than "best-accuracy"; XFeat+LighterGlue narrows the gap (-2.5 to -2.8 absolute vs SP+LightGlue) but does not exceed; (ii) NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo — project would need custom-ONNX-export engineering effort but the architecture is straightforward (Conv + ReLU + BatchNorm only, no deformable convolutions or graph-neural-network attention export complexity). → Status: Modern-competitive-lead with three converging POSITIVE structural advantages (clean Apache-2.0 throughout + strongest embedded-deployment signal among all C3 candidates evaluated + strongest retrain-friendliness signal among all C3 candidates evaluated) + ONE NEGATIVE structural finding (NO PRODUCTIZED ONNX/TENSORRT EXPORT PATHWAY) + AERIAL-DOMAIN-TRAINING CAVEAT (D-C2-1 reuse but XFeat is cheapest retrain) + MegaDepth-1500-sparse-mode-modestly-below-LightGlue-siblings CAVEAT (XFeat+LighterGlue narrows gap), Apache-2.0 license track on extractor + matcher (and LighterGlue companion). 
Final ranking deferred to Jetson MVE phase per the project's D-C1-2 + D-C3-2 deferred-MVE strategy. Per the engine Component Option Breadth rule, XFeat closes the C3 mandatory pre-screen modern-competitive-lead axis at 5 of N candidates (modern-competitive-lead axis materially-expanded with structurally-different design point [lightweight CNN with decoupled keypoint detection + lightweight MLP-based match refinement vs LightGlue's transformer-based attention matcher]). Subsequent C3 candidates (DoGHardNet+LightGlue additional cvg/LightGlue extractor-matcher sibling, SIFT+LightGlue classical-detector pairing, etc.) will be separately-cataloged in subsequent sessions if needed.
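
The XFeat sparse mode described above matches descriptors by mutual nearest neighbour (MNN) search. The sketch below illustrates that technique in plain Python on toy 3-D vectors; it is not the canonical verlab/accelerated_features implementation. The function name `mutual_nearest_neighbors` and the `min_similarity` parameter are hypothetical, and descriptors are assumed to be L2-normalised so that a dot product acts as cosine similarity.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mutual_nearest_neighbors(desc_a, desc_b, min_similarity=0.0):
    """Return (i, j, similarity) for pairs that are each other's best match.

    desc_a, desc_b: lists of descriptor vectors (assumed L2-normalised).
    min_similarity: hypothetical cut-off, not a documented XFeat parameter.
    """
    if not desc_a or not desc_b:
        return []
    # Full similarity matrix: sim[i][j] = <desc_a[i], desc_b[j]>.
    sim = [[dot(a, b) for b in desc_b] for a in desc_a]
    # Best column for every row, and best row for every column.
    best_b_for_a = [max(range(len(desc_b)), key=row.__getitem__) for row in sim]
    best_a_for_b = [
        max(range(len(desc_a)), key=lambda i: sim[i][j]) for j in range(len(desc_b))
    ]
    matches = []
    for i, j in enumerate(best_b_for_a):
        # Keep the pair only if the choice is mutual and similar enough.
        if best_a_for_b[j] == i and sim[i][j] >= min_similarity:
            matches.append((i, j, sim[i][j]))
    return matches


# Toy example: 3-D unit vectors stand in for the 64-D descriptors.
uav_desc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
tile_desc = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
matches = mutual_nearest_neighbors(uav_desc, tile_desc)
print(matches)
```

In the toy data the third UAV descriptor prefers the first tile descriptor, but that tile descriptor prefers a different UAV descriptor, so the pair is rejected. Only mutual choices survive, and those are what a downstream RANSAC stage would consume.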

C3 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate (XFeat modern-competitive-lead addition)

XFeat — per-numbered binding (C3-relevant lines only; cross-cutting N/A above also apply identically)

Cells share the legend defined under the MixVPR sub-matrix (C2). Where a binding is identical in both substance and evidence to the SP+LightGlue or DISK+LightGlue or ALIKED+LightGlue rows, the XFeat row points to those rows to avoid restating; where XFeat's pinned mode produces a materially different binding (modern-competitive-lead role with strongest embedded-deployment signal + cleanest retrain story but no productized ONNX/TensorRT export pathway), the XFeat row carries a distinct evidence cite.

Line Binding Evidence (one-line cite)
AC-1.1 (frame-center within 50 m, ≥80% normal-flight photos) Pass (documentary on MegaDepth-1500 + ScanNet-1500) → Verify (aerial nadir cross-domain) Source #81 paper Table 1 MegaDepth-1500 (XFeat sparse AUC@5°/10°/20° = 42.6/56.4/67.7 / XFeat* semi-dense = 50.2/65.4/77.1) + Table 2 ScanNet-1500 (XFeat outperforms ALL baselines including SuperPoint+DISK+ALIKE on indoor cross-domain transfer despite all methods being MegaDepth-trained per Appendix E hybrid-training generalization advantage). XFeat+LighterGlue narrows MegaDepth-1500 gap to within -2.5 absolute of SP+LightGlue per Source #80 README cross-cite. D-C2-1 retrain decision REUSE with XFeat-strongest-retrain-friendliness advantage (36 hours single RTX 4090 + 6.5 GB VRAM total)
AC-1.2 (frame-center within 20 m, ≥50% normal-flight photos) Pass (documentary on MegaDepth-1500 + ScanNet-1500) → Verify Same as AC-1.1, tighter tail; XFeat* semi-dense AUC@10°=65.4 documentary on MegaDepth-1500 vs DISK+LightGlue 83.45 (-18 absolute) + SP+LightGlue 79.3 (-13.9 absolute); XFeat sparse is lower still (AUC@10°=56.4 per Source #81 paper Table 1), so both standalone XFeat modes are materially below LightGlue-siblings at this tier; XFeat+LighterGlue narrows gap
AC-2.1b (satellite-anchor registration succeeds, AC-1.1/1.2 + AC-2.2 + AC-8.2 + AC-8.6 conditions) Pass (documentary) → Verify C3's contribution is the geometric verification step; XFeat* semi-dense provides 1885 inliers per pair vs LightGlue 475 = 4× more inliers per pair per Source #81 Appendix F Table 6 — structurally-superior inlier count provides better RANSAC stability for downstream C4 PnP+RANSAC vs LightGlue-sibling sparse modes
AC-3.3 (≥3 disconnected segments via satellite-reference re-localization) Pass (per-pair stateless) XFeat per-pair geometric verification is stateless; same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue rows
AC-4.1 (latency <400 ms p95, end-to-end camera→FC) STRONGEST EXTRAPOLATED LATENCY ADVANTAGE AMONG ALL C3 CANDIDATES EVALUATED + Verify (custom-ONNX-export-effort-required) CRITICAL POSITIVE finding for XFeat sparse: Source #81 paper Appendix C Orange Pi Zero 3 ARM Cortex-A53 ($28 device) at 480×360 input documents XFeat=1.8 FPS vs SuperPoint=0.16 FPS (11.25× faster) vs ALIKE=0.58 FPS (3.1× faster); paper explicitly states "XFeat is the ONLY learned method capable of running over 1 FPS on highly-constrained embedded device". Jetson Orin Nano Super extrapolation: XFeat sparse PyTorch-fp16 ~10-30 ms per pair (vs ALIKED's ~70-140 ms / DISK's ~200-400 ms / SP+LightGlue's ~30-60 ms). At K=10 pairs/frame extrapolated 100-300 ms total = comfortable AC-4.1 satisfaction with materially-largest-latency-margin among all C3 candidates evaluated. HOWEVER, NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo (Source #80 README Contributing section explicit community-contribution ask) — project must invest custom-ONNX-export engineering effort to realize the strongest latency advantage; the architecture is straightforward (Conv + ReLU + BatchNorm only, no deform_conv2d blocker like ALIKED, no graph-neural-network attention export complexity like SuperGlue), so custom export should be technically feasible. D-C3-2 gate HARSHER than DISK+LightGlue's well-documented LightGlue-ONNX TensorRT pathway but TECHNICALLY SIMPLER than ALIKED+LightGlue's deform_conv2d ONNX-export blocker
AC-4.2 (memory <8 GB shared) Pass (with Verify) - smallest model footprint among all modern competitive C3 candidates evaluated XFeat featherweight backbone = 23 conv layers with channel sequence {4,8,24,64,64,128} = ~3-5 MB at fp16 (smallest of any C3 candidate evaluated, tied with LighterGlue; vs SP+LightGlue ~27 MB / DISK+LightGlue ~26 MB / ALIKED+LightGlue ~27 MB). 64-D descriptors (vs SuperPoint 256-D / ALIKED 128-D) reduce per-keypoint descriptor memory at match time (4× / 2×); this is a runtime-memory advantage only, since C3 keeps no pre-cached descriptor state. Activations: paper §3.1 explicit "we keep the resolution as large as possible while limiting the number of channels in the network", i.e. minimal activation memory. Co-resident memory pressure with C1/C2/C4/C5/C6 is the lowest among all C3 candidates evaluated
AC-8.1 (cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px) Pass (with Verify) — resolution-agnostic at API level XFeat is resolution-agnostic at the algorithm level; canonical demo evaluates at 640 max dim (Fast) or 1024 max dim (Accurate) per Source #80 README; aerial-domain cross-resolution validation deferred to Jetson MVE phase + D-C2-1 retrain decision
AC-8.6 — Scale-ratio (any UAV-frame ground footprint at deployment altitude must be retrievable) Verify Same as SP+LightGlue scale-ratio row; XFeat* semi-dense 2-scale processing (0.65× + 1.3× resize) provides a structurally-favorable multi-scale extension vs single-scale LightGlue-sibling sparse-only methods
AC-8.6 — Scene change in active-conflict sectors Verify with structural-cross-domain-generalization-advantage CRITICAL POSITIVE finding: Source #81 paper Table 2 + Appendix E document XFeat outperforming SuperPoint+DISK+ALIKE on ScanNet-1500 indoor cross-domain transfer despite all methods being MegaDepth-trained — paper attributes this to hybrid MegaDepth+synthetic-warp-COCO training reducing landmark-dataset overfitting bias. This is the strongest documented cross-domain-generalization signal among all C3 candidates evaluated. D-C2-1 retrain decision REUSE with XFeat-strongest-retrain-friendliness advantage + XFeat's hybrid-training paradigm aligns with project's seasonal/visibility class generalization requirement
AC-8.6 — Compute & latency under steady-state and re-loc-trigger STRONGEST EXTRAPOLATED LATENCY ADVANTAGE — same AC-4.1 binding Same as AC-4.1; XFeat provides the largest latency margin among all C3 candidates evaluated at K=10 pairs/frame on Jetson Orin Nano Super extrapolation, conditional on custom-ONNX-export engineering effort to realize the advantage
AC-NEW-2 (spoofing-promotion latency <3 s p95) Pass (mechanical) with strongest latency margin XFeat single-pair latency on Jetson extrapolated ~10-30 ms vs 3 s budget = ~100-300× headroom per pair; at K=10 pairs/frame the extrapolated 100-300 ms total still leaves ~10-30× headroom
AC-NEW-6 (imagery freshness — never satellite_anchored on stale-tile match) Pass (mechanical) XFeat produces 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair; freshness-age decision is a downstream C5/C6 filter
AC-NEW-7 (cache-poisoning safety budget — P(>30 m geo-misalign) <1%, P(>100 m) <0.1%) Pass — STRUCTURAL geometric-verification with strongest inlier count via XFeat* semi-dense XFeat per-correspondence confidence threshold + RANSAC inlier selection provides structural geometric-verification layer — strongest inlier count per pair among all C3 candidates evaluated via XFeat* semi-dense (1885 inliers per pair vs LightGlue 475 per Source #81 Appendix F Table 6) — gives best structural cache-poisoning defense
Restriction "Operational area: eastern/southern Ukraine" - sparse-matcher train-domain match Verify (D-C2-1 reuse) - XFeat is cheapest retrain candidate among all C3 candidates evaluated CRITICAL POSITIVE finding: Source #81 paper §3.3 + Appendix B explicit "trained on a single NVIDIA RTX 4090 GPU, consuming 6.5 GB of VRAM in total, considering both training and synthetic warps done on the fly on GPU" + 36 hours total convergence; paper §3.3 explicit "low memory usage of our method enables training on entry-level hardware, facilitating the fine-tuning or full training of our network for specific tasks and scene types"; XFeat is the cheapest C3 candidate to execute D-C2-1 = (a) project-domain retrain on aerial nadir corpus: materially cheaper than DISK+LightGlue (~2 weeks 32 GB V100 per Source #77) + comparable to ALIKED+LightGlue (~24 hours RTX 3090 per Source #74), whereas SuperGlue+SuperPoint cannot be retrained from its canonical repo at all (training-code-not-released per Source #78). Hybrid MegaDepth+synthetic-warp-COCO training paradigm (paper Appendix E) provides structural cross-domain-generalization advantage that aligns with project's aerial-nadir vs phototourism-outdoor cross-domain requirement
Restriction "Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural)" - sparse-matcher scale band match Verify with multi-scale advantage via XFeat* semi-dense XFeat* semi-dense 2-scale processing (0.65× + 1.3× resize) provides structural multi-scale extension; canonical operating range bounded by Source #80 README VGA-to-MegaDepth-1200 evaluation extrapolated to project's 1024-largest-edge config
Restriction "Weather: predominantly sunny ... seasonal/visibility classes" — sparse-matcher cross-season generalization Verify with structural-cross-domain-generalization-advantage Same as AC-8.6 scene change row; XFeat's hybrid MegaDepth+synthetic-warp-COCO training paradigm reduces landmark-dataset overfitting bias per paper Appendix E — provides structural cross-season generalization advantage that the LightGlue-sibling-extractors do not have
Restriction "Navigation camera (pinned): ADTi 20MP, 5472×3648" Pass (API) — same downscale as canonical XFeat consumes 1024-largest-edge grayscale-or-RGB input; same downscale as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue. D-C2-3 input-resolution-shape Plan-phase decision applies identically
Restriction "Satellite Imagery — resolution ≥0.5 m/px" — sparse-matcher pipeline at AC-8.1 floor Verify Same as AC-8.1
Restriction "Satellite Imagery — Cache budget: 10 GB" — sparse-matcher cache footprint Pass — NO C3 cache footprint C3 cache footprint is exactly 0 GB — same as SP+LightGlue + DISK+LightGlue + ALIKED+LightGlue + SuperGlue+SuperPoint; XFeat operates on UAV-frame + retrieved-tile pair on-the-fly with no pre-cached match-time state
Restriction "Companion computer: Jetson Orin Nano Super, 8 GB shared" STRONGEST EXTRAPOLATED LATENCY ADVANTAGE WITH CUSTOM-ONNX-EXPORT-EFFORT-REQUIRED CAVEAT CRITICAL POSITIVE finding: Source #81 paper Appendix C Orange Pi Zero 3 ARM Cortex-A53 1.8 FPS at 480×360 input — strongest documented embedded-deployment signal among all C3 candidates evaluated. HOWEVER, NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY in canonical repo (Source #80 README Contributing section explicit community-contribution ask) — D-C3-2 gate HARSHER than DISK+LightGlue but TECHNICALLY SIMPLER than ALIKED+LightGlue. Implication for D-C3-2: XFeat's Jetson runtime path requires custom-ONNX-export engineering effort to realize the strongest extrapolated latency advantage; the architecture is straightforward (Conv + ReLU + BatchNorm only, no deform_conv2d blocker, no graph-neural-network attention export complexity), so custom export should be technically feasible at moderate engineering cost (~1-2 weeks vs ALIKED's deform_conv2d export blocker which requires custom plugin engineering ~4-6 weeks effort). PyTorch-fp16-only fallback (~10-30 ms per pair Jetson Orin Nano Super extrapolation) STILL provides AC-4.1 satisfaction at K=10 pairs/frame
Restriction "License posture (D-C1-1)" — sparse-matcher license-track interaction POSITIVE finding (CLEAN-APACHE-2.0 license track THROUGHOUT) — TIED-CLEANEST license-compliant LightGlue-extractor-sibling alongside DISK+LightGlue POSITIVE on canonical verlab/accelerated_features: Source #80 GitHub API license metadata = Apache-2.0 (license.spdx_id: "Apache-2.0") — permissive, BSD/permissive license track. POSITIVE on XFeat+LighterGlue companion mode: Source #80 README explicit cross-cite to cvg/glue-factory + cvg/LightGlue (both Apache-2.0 per Source #70) = clean Apache-2.0 throughout. CLEAN APACHE-2.0 LICENSE TRACK THROUGHOUT — no Magic Leap noncommercial-research disqualifier (vs SP+LightGlue + SuperGlue+SuperPoint), no GPL-3.0 copyleft (vs SALAD on C2 row), no BSD-3-Clause + Apache-2.0 mixed track (vs ALIKED+LightGlue). TIED-CLEANEST license-compliant LightGlue-extractor-sibling-or-modern-competitive-lead in the project's evaluated C3 candidate space, alongside DISK+LightGlue's RECOMMENDED-PRIMARY-MITIGATION role. Under D-C1-1 = (a) GPL-3.0 track, (b) BSD/permissive lock, or (c) keep-both-tracks-open, XFeat is eligible on every license-posture choice with the cleanest license-compliance story TIED with DISK+LightGlue. D-C3-1 ALTERNATE-MODERN-COMPETITIVE-LEAD role — XFeat is the second cleanest license-compliant + structurally-different design point + materially-cheapest-retrain-cost C3 candidate alongside DISK+LightGlue's RECOMMENDED-PRIMARY. Three converging POSITIVE structural advantages: (i) CLEAN APACHE-2.0 license track THROUGHOUT (TIED with DISK+LightGlue); (ii) STRONGEST DOCUMENTED EMBEDDED-DEPLOYMENT SIGNAL (Source #81 Appendix C Orange Pi Zero 3 1.8 FPS at 480×360 input — strongest signal among all C3 candidates evaluated); (iii) STRONGEST RETRAIN-FRIENDLINESS SIGNAL (36 hours single RTX 4090 + 6.5 GB VRAM total — strongest signal among all C3 candidates evaluated). 
One NEGATIVE structural finding: NO PRODUCTIZED ONNX/TensorRT EXPORT PATHWAY (D-C3-2 gate HARSHER than DISK but TECHNICALLY SIMPLER than ALIKED). Recommendation: present D-C1-1 + D-C3-1 + D-C3-6 + this row to user as a structured Choose block at Plan time; XFeat is a strong ALTERNATE to DISK+LightGlue's RECOMMENDED-PRIMARY-MITIGATION role with the trade-off of (a) lower documentary AUC@5° on MegaDepth-1500 sparse mode (-7 to -25 absolute below LightGlue-siblings; XFeat+LighterGlue narrows to -2.5 to -2.8 absolute), (b) custom-ONNX-export engineering effort required (~1-2 weeks vs DISK's productized LightGlue-ONNX TensorRT pathway), and (c) materially-cheaper retrain cost (~36 hours single RTX 4090 vs DISK's ~2 weeks 32 GB V100)
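
The latency bindings in the sub-matrix above (AC-4.1, AC-NEW-2) rest on simple budget arithmetic over extrapolated per-pair figures. The sketch below reproduces that arithmetic. The per-pair ranges are the PyTorch-fp16 extrapolations quoted in this card, not measurements; the helper name `headroom` is hypothetical, and the C3 stage is treated in isolation, ignoring every other stage of the camera→FC path.

```python
def headroom(per_pair_ms, pairs_per_frame, budget_ms):
    """Return ((total_lo, total_hi), (headroom_lo, headroom_hi)) for a latency range."""
    lo, hi = per_pair_ms
    total = (lo * pairs_per_frame, hi * pairs_per_frame)
    # Worst-case headroom uses the slow end of the range, best case the fast end.
    return total, (budget_ms / total[1], budget_ms / total[0])


# Extrapolated Jetson Orin Nano Super PyTorch-fp16 per-pair latencies (ms), as quoted.
PER_PAIR_MS = {
    "XFeat": (10, 30),
    "SP+LightGlue": (30, 60),
    "ALIKED+LightGlue": (70, 140),
    "DISK+LightGlue": (200, 400),
}
K = 10                 # retrieved tiles matched per UAV frame
AC_4_1_MS = 400        # end-to-end latency budget, p95
AC_NEW_2_MS = 3000     # spoofing-promotion latency budget, p95

totals = {name: headroom(rng, K, AC_4_1_MS)[0] for name, rng in PER_PAIR_MS.items()}
xfeat_ac41 = headroom(PER_PAIR_MS["XFeat"], K, AC_4_1_MS)[1]
xfeat_acnew2 = headroom(PER_PAIR_MS["XFeat"], K, AC_NEW_2_MS)[1]
xfeat_single_pair = headroom(PER_PAIR_MS["XFeat"], 1, AC_NEW_2_MS)[1]

for name, (lo, hi) in totals.items():
    print(f"{name}: {lo}-{hi} ms per frame at K={K}")
print(f"XFeat headroom vs AC-4.1: {xfeat_ac41[0]:.1f}x to {xfeat_ac41[1]:.1f}x")
print(f"XFeat headroom vs AC-NEW-2 at K={K}: {xfeat_acnew2[0]:.0f}x to {xfeat_acnew2[1]:.0f}x")
```

Under these assumptions XFeat is the only listed candidate whose whole K=10 range stays under 400 ms. Its headroom against the 3 s budget is 100-300× for a single pair and 10-30× for a full K=10 frame.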