E-CC-HELPERS closes with the three remaining Layer-1 helpers and E-CC-CONF closes with the env > YAML > defaults precedence test gate. All four tickets ship with frozen public surfaces, hermetic unit tests, and no upward (components.*) imports.

* AZ-271: tests/unit/shared/config/test_precedence.py (5 ACs + smoke test + helper that names the layer in failure messages).
* AZ-282: helpers/ransac_filter.py. Static RansacFilter + RansacResult; cv2.setRNGSeed(0) for byte-equal determinism; median residual semantics pinned by contract.
* AZ-276: helpers/imu_preintegrator.py + make_imu_preintegrator. GTSAM PreintegratedCombinedMeasurements; strict-monotonic ts_ns guard runs before any state mutation. Adjacent hygiene: _types/nav.py ImuSample/ImuWindow now use ts_ns:int and the spec-mandated ImuBias dataclass.
* AZ-278: helpers/lightglue_runtime.py. Structural R14 fix. LightGlueRuntime + non-blocking concurrent-access guard that raises rather than serialising. EngineHandle Protocol in _types/manifests.py + KeypointSet/CorrespondenceSet in _types/matching.py (Protocol surface adds approved by spec).

Dependency conflict (Finding 1, user-approved): gtsam 4.2 (PyPI) is numpy-1.x-ABI only; opencv-python>=4.12 needs numpy>=2 at runtime. Resolution: opencv-python pin relaxed to >=4.11.0.86,<4.12. The D-CROSS-CVE-1 ratchet at ci/opencv_pin_gate.py is held at 4.11.0 with the original 4.12.0 floor restored once a numpy-2-compatible gtsam wheel ships. Full replay procedure in _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md.

Tests: 294 passed, 2 skipped (cmake/actionlint env-skips, pre-existing). 43 new tests added for batch 5. Ruff check + format clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
LightGlueRuntime Helper Module (R14 fix)
Task: AZ-278_lightglue_runtime
Name: LightGlueRuntime Helper
Description: Implement the shared LightGlueRuntime helper that owns the LightGlue inference engine handle for both C2.5 (single-pair inlier counting) and C3 (heavier matching pass). This is the structural fix for R14 (the original C2.5 ↔ C3 import cycle): the runtime sits at Layer 1 with no components.* imports, so the cycle becomes impossible to express. Single CUDA stream; concurrent access forbidden by contract; composition root binds to the single F3 hot-path thread.
Complexity: 3 points
Dependencies: AZ-263_initial_structure
Component: shared.helpers.lightglue_runtime (cross-cutting; epic AZ-264 / E-CC-HELPERS)
Tracker: AZ-278
Epic: AZ-264 (E-CC-HELPERS)
Document Dependencies
- _docs/02_document/contracts/shared_helpers/lightglue_runtime.md: frozen public interface this task produces.
- _docs/02_document/common-helpers/03_helper_lightglue_runtime.md: design rationale and R14 context.
Problem
C2.5 (Re-rank) and C3 (CrossDomainMatcher) both call LightGlue. In cycle 1 of _docs/02_document/epics.md, LightGlue ownership was ambiguous and produced R14: a circular import / runtime dependency between C2.5 and C3 (the "K=10 → N=3 funnel" both wanted to own the engine). Without a shared runtime:
- The engine is built / loaded twice, doubling GPU memory at takeoff (Tier-2 has only 8 GB).
- C2.5 and C3 drift on engine version pinning, producing inconsistent matches.
- Their import cycle is a recurring footgun: any future refactor will tempt one to import from the other.
Outcome
- A single LightGlueRuntime instance is constructed once at takeoff by the composition root from C7's deserialize_engine(LIGHTGLUE_ENGINE_CACHE_ENTRY) and is constructor-injected into BOTH C2.5 and C3.
- The C2.5 ↔ C3 import cycle is structurally impossible: the runtime lives at Layer 1 (helpers/) and imports zero components.* modules. Both consumers depend on the helper; neither depends on the other.
- Concurrent access is rejected at runtime by an explicit guard (LightGlueConcurrentAccessError), preserving the single-CUDA-stream invariant. The composition root binds the runtime to the single F3 hot-path thread; AC-4 of the contract is the canary that catches future composition-root mistakes.
- The helper exposes no set_* / update_* methods; once constructed, the runtime's behaviour is fixed.
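The wiring described in the Outcome can be sketched as follows. This is a minimal illustration, not the project's composition root: `ReRanker`, `CrossDomainMatcher`, and `compose` are hypothetical names, and the `LightGlueRuntime` class here is a local stand-in that only mirrors the constructor.

```python
from __future__ import annotations

from dataclasses import dataclass


class LightGlueRuntime:
    """Local stand-in for the real helper; only the constructor matters here."""

    def __init__(self, engine_handle: object) -> None:
        self._engine = engine_handle


@dataclass(frozen=True)
class ReRanker:
    """Hypothetical C2.5 consumer: receives the runtime, never builds it."""

    runtime: LightGlueRuntime


@dataclass(frozen=True)
class CrossDomainMatcher:
    """Hypothetical C3 consumer: receives the same runtime instance."""

    runtime: LightGlueRuntime


def compose(engine_handle: object) -> tuple[ReRanker, CrossDomainMatcher]:
    # Exactly one runtime is built, then handed to both consumers.
    runtime = LightGlueRuntime(engine_handle)
    return ReRanker(runtime), CrossDomainMatcher(runtime)


rerank, matcher = compose(engine_handle=object())
shared = rerank.runtime is matcher.runtime
```

Neither consumer class references the other, which is the property the R14 fix relies on: the only shared dependency is the helper.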
Scope
Included
- LightGlueRuntime(engine_handle: EngineHandle) constructor.
- match(features_a: KeypointSet, features_b: KeypointSet) -> CorrespondenceSet (single-pair path used by C2.5).
- match_batch(features_a_list, features_b_list) -> list[CorrespondenceSet] (batch path used by C3).
- descriptor_dim() -> int accessor for shape validation upstream of match.
- Concurrent-access guard that raises LightGlueConcurrentAccessError on overlapping match / match_batch entries.
- LightGlueRuntimeError (construction / dim mismatch) and LightGlueConcurrentAccessError (concurrent entry) exception types.
- Public interface contract published at _docs/02_document/contracts/shared_helpers/lightglue_runtime.md.
Excluded
- Engine compilation / serialisation: C7.
- Engine filename schema: helpers.engine_filename_schema (separate task in this epic).
- Engine cache management / takeoff load: C10.
- Backbone-specific feature extraction (DISK / ALIKED / XFeat): C3 / C7.
- Multi-GPU / multi-stream / mixed-backbone: out of scope for v1.0.0.
- The EngineHandle Protocol itself: owned by _types/manifests.py (AZ-263) so Layer 1 can reference it without depending on C7.
Acceptance Criteria
AC-1: Single-pair match (C2.5 path)
Given a pair of KeypointSets with matching descriptor dim and a synthetic-overlap fixture
When match(features_a, features_b) runs
Then a CorrespondenceSet is returned with len > 0 and the inlier-count helper used by C2.5 finds the expected count
AC-2: Batch match (C3 path)
Given three pairs of KeypointSets
When match_batch([a1, a2, a3], [b1, b2, b3]) runs
Then three CorrespondenceSets are returned in input order; per-pair invariants match the single-pair path
AC-3: Descriptor-dim mismatch rejected
Given features whose descriptor_dim does not match the engine's expected dim
When match runs
Then LightGlueRuntimeError is raised with a message naming both the expected and actual dims
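A minimal sketch of the check AC-3 describes, assuming the validation lives in a small function. The name `check_descriptor_dim`, the `label` argument, and the example dims are illustrative assumptions, not part of the frozen contract.

```python
class LightGlueRuntimeError(Exception):
    """Local stand-in for the helper's error type."""


def check_descriptor_dim(expected: int, actual: int, label: str) -> None:
    """Raise if a feature set's descriptor dim differs from the engine's."""
    if actual != expected:
        # The message names both dims, as AC-3 requires.
        raise LightGlueRuntimeError(
            f"{label}: descriptor dim mismatch (expected {expected}, got {actual})"
        )


# Matching dims pass silently.
check_descriptor_dim(expected=128, actual=128, label="features_a")

try:
    check_descriptor_dim(expected=128, actual=256, label="features_b")
except LightGlueRuntimeError as exc:
    message = str(exc)
```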
AC-4: Concurrent access rejected
Given two threads call match simultaneously on the same LightGlueRuntime instance
When the second call enters
Then LightGlueConcurrentAccessError is raised in the second thread; the first thread completes normally
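One way to make the AC-4 test deterministic is to park the first call inside the engine until the second call has been attempted. The sketch below assumes a hypothetical `BlockingStubEngine` and a reduced `MiniRuntime` standing in for the real helper; only the orchestration with `threading.Event` is the point.

```python
import threading


class LightGlueConcurrentAccessError(RuntimeError):
    """Local stand-in for the helper's error type."""


class BlockingStubEngine:
    """Hypothetical stub: forward() parks until the test releases it."""

    def __init__(self) -> None:
        self.entered = threading.Event()
        self.release = threading.Event()

    def forward(self, features_a, features_b):
        self.entered.set()
        self.release.wait(timeout=5.0)
        return [(0, 0)]


class MiniRuntime:
    """Reduced stand-in for LightGlueRuntime with a raise-on-contention guard."""

    def __init__(self, engine) -> None:
        self._engine = engine
        self._busy = threading.Lock()

    def match(self, features_a, features_b):
        if not self._busy.acquire(blocking=False):
            raise LightGlueConcurrentAccessError("match already in progress")
        try:
            return self._engine.forward(features_a, features_b)
        finally:
            self._busy.release()


def run_ac4() -> dict:
    engine = BlockingStubEngine()
    runtime = MiniRuntime(engine)
    outcome = {}

    def first_call() -> None:
        outcome["first"] = runtime.match("a", "b")

    worker = threading.Thread(target=first_call)
    worker.start()
    # Wait until the first call is provably inside the engine.
    assert engine.entered.wait(timeout=5.0)
    try:
        runtime.match("a", "b")
        outcome["second"] = "no error"
    except LightGlueConcurrentAccessError:
        outcome["second"] = "rejected"
    finally:
        engine.release.set()
        worker.join(timeout=5.0)
    return outcome


result = run_ac4()
```

Because the overlap is forced by events rather than by sleeps, the test does not depend on scheduler timing.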
AC-5: Construction-time guard
Given LightGlueRuntime(engine_handle=None)
When construction runs
Then LightGlueRuntimeError is raised mentioning engine_handle
AC-6: No upward imports — R14 structural fix
Given the helper module
When a static-import check runs across gps_denied_onboard.helpers.lightglue_runtime
Then it imports ONLY from _types, numpy, and stdlib — NO imports from gps_denied_onboard.components.* (verified by importlinter or grep gate in CI)
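If importlinter is not available, a grep-style gate for AC-6 can be approximated with the standard `ast` module. This is a sketch under assumptions: the function name `upward_imports` is hypothetical, the `c3` submodule in the sample is made up for illustration, and relative imports are skipped rather than resolved.

```python
import ast

FORBIDDEN = "gps_denied_onboard.components"


def upward_imports(source: str) -> list[int]:
    """Return line numbers of absolute imports that reach into components.*."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            base = node.module or ""
            # Covers both "from pkg.components.x import y"
            # and "from pkg import components".
            candidates = [base] + [f"{base}.{alias.name}" for alias in node.names]
        else:
            continue  # not an import, or a relative import (not resolved here)
        if any(c == FORBIDDEN or c.startswith(FORBIDDEN + ".") for c in candidates):
            offenders.append(node.lineno)
    return sorted(offenders)


clean = (
    "import numpy as np\n"
    "from gps_denied_onboard._types.matching import KeypointSet\n"
)
dirty = clean + (
    "from gps_denied_onboard.components.c3 import matcher\n"
    "from gps_denied_onboard import components\n"
)

clean_offenders = upward_imports(clean)
dirty_offenders = upward_imports(dirty)
```

In CI the same function could be fed the text of helpers/lightglue_runtime.py and the gate would fail on a non-empty result.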
AC-7: Determinism downstream of the engine
Given the same (features_a, features_b) pair matched twice with the same engine_handle
When match runs both times
Then both CorrespondenceSet outputs are byte-equal (engine determinism is a C7 concern; this AC asserts the helper itself adds no non-determinism)
Non-Functional Requirements
Performance
- match p99 ≤ 30 ms on Tier-2 with the production DISK+LightGlue engine on a typical K=10 candidate pair (matches the per-frame budget for C2.5's K=10 → N=3 funnel).
- Helper-level overhead (excluding the engine call itself) ≤ 100 µs, verified via a benchmark that swaps in a stub engine handle.
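The helper-overhead budget could be measured roughly as below. This is a sketch, not the project's benchmark: `StubEngine`, `MiniRuntime`, and `p99_ns` are hypothetical names, and because the stub returns immediately, the measured p99 is only an upper bound on helper overhead (it also includes timer and call overhead).

```python
import time


class StubEngine:
    """Hypothetical stub whose forward() returns immediately."""

    def forward(self, features_a, features_b):
        return ()


class MiniRuntime:
    """Reduced stand-in for LightGlueRuntime: forwards straight to the engine."""

    def __init__(self, engine) -> None:
        self._engine = engine

    def match(self, features_a, features_b):
        return self._engine.forward(features_a, features_b)


def p99_ns(fn, iterations: int = 10_000) -> int:
    """Time fn() repeatedly and return the 99th-percentile latency in ns."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    samples.sort()
    return samples[int(0.99 * (len(samples) - 1))]


runtime = MiniRuntime(StubEngine())
overhead_us = p99_ns(lambda: runtime.match(None, None)) / 1_000
within_budget = overhead_us <= 100.0  # budget from the NFR; result is machine-dependent
```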
Reliability
- LightGlueRuntimeError and LightGlueConcurrentAccessError are the ONLY exception types the public surface raises. Engine-internal exceptions MUST be wrapped.
- Pure-deterministic given a deterministic engine; the helper itself adds no random state.
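The wrapping rule can be illustrated with Python's exception chaining. The sketch assumes a hypothetical `EngineInternalError` and `call_engine` wrapper; how the real helper structures this is defined by the contract, not by this example.

```python
class LightGlueRuntimeError(Exception):
    """Local stand-in for the helper's error type."""


class EngineInternalError(Exception):
    """Hypothetical exception raised from inside an engine."""


class FailingStubEngine:
    def forward(self, features_a, features_b):
        raise EngineInternalError("simulated engine failure")


def call_engine(engine, features_a, features_b):
    """Invoke the engine, converting any engine exception to the public type."""
    try:
        return engine.forward(features_a, features_b)
    except Exception as exc:
        # "from exc" keeps the original traceback reachable via __cause__.
        raise LightGlueRuntimeError(f"engine forward failed: {exc}") from exc


try:
    call_engine(FailingStubEngine(), None, None)
except LightGlueRuntimeError as exc:
    wrapped = exc
```

Callers then only need to catch the two public exception types, while the original failure stays available for debugging through `wrapped.__cause__`.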
Concurrency
- Single-thread by contract. The concurrent-access guard is the runtime invariant detector — any composition-root regression that wires the runtime into multiple threads is caught immediately rather than producing GPU memory corruption.
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | single-pair match on synthetic-overlap fixture | non-empty CorrespondenceSet |
| AC-2 | batch of 3 pairs | three results in input order; per-pair invariants match AC-1 |
| AC-3 | dim-mismatched features | LightGlueRuntimeError; message names expected & actual dims |
| AC-4 | two threads call match simultaneously | one succeeds; the second raises LightGlueConcurrentAccessError |
| AC-5 | construct with engine_handle=None | LightGlueRuntimeError |
| AC-6 | importlinter / grep gate over helpers/lightglue_runtime.py | no components.* imports |
| AC-7 | same pair matched twice | byte-equal outputs (with deterministic stub engine) |
| NFR-perf | microbench match overhead with stub engine (10k iterations on Tier-2 fixture) | helper overhead ≤ 100 µs |
Constraints
- Public surface frozen by _docs/02_document/contracts/shared_helpers/lightglue_runtime.md v1.0.0.
- Layer 1 Foundation only. NO upward imports; this is the load-bearing constraint for the R14 fix.
- The EngineHandle Protocol must be defined in _types/manifests.py (AZ-263 / E-BOOT) so this helper can reference it without importing C7. If _types/manifests.py does not yet define the Protocol surface (forward(...), descriptor_dim), this task adds it; that is the only _types edit allowed by this task.
- No new dependency beyond what AZ-263 / E-BOOT pinned.
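A `typing.Protocol` lets Layer 1 describe the engine by shape alone, which is why the helper never needs to import C7. The sketch below is an assumption-laden illustration: the spec names only `forward(...)` and `descriptor_dim`, so the open-ended `forward` signature and the choice to model `descriptor_dim` as a method are guesses, and `StubEngine` is hypothetical.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class EngineHandle(Protocol):
    """Structural type: any object with these two members qualifies."""

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        # Argument and return shapes are left open; the spec elides them.
        ...

    def descriptor_dim(self) -> int:
        # Assumed to be a method; the spec only names the member.
        ...


class StubEngine:
    """Test double that neither imports nor subclasses anything from C7."""

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        return []

    def descriptor_dim(self) -> int:
        return 128


stub_conforms = isinstance(StubEngine(), EngineHandle)
plain_object_conforms = isinstance(object(), EngineHandle)
```

Note that `runtime_checkable` only checks that the members exist, not their signatures, so static type checking remains the stronger guarantee.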
Risks & Mitigation
Risk 1: Composition root accidentally creates two runtimes (one for C2.5, one for C3)
- Risk: Future composition-root refactor instantiates LightGlueRuntime twice; engine memory doubles, behaviour drifts.
- Mitigation: The composition-root contract test (E-CC-CONF / AZ-246, AZ-269/AZ-270 in scope) already verifies cardinality of cross-cutting helpers. This task's contract documents that EXACTLY ONE instance is expected; the composition-root validator is the enforcement point.
Risk 2: Concurrent-access guard introduces hot-path overhead
- Risk: A naive threading.Lock on every match call adds hundreds of µs.
- Mitigation: The guard uses a non-blocking threading.local()-style check or a Lock().acquire(blocking=False) pattern that simply RAISES on contention rather than serialising callers. The contract is "concurrent calls are a bug", not "serialise concurrent callers". The NFR-perf microbench validates the overhead budget.
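The raise-on-contention pattern from the mitigation can be sketched as a small context manager around `threading.Lock.acquire(blocking=False)`. The class name `SingleEntryGuard` is hypothetical; the real helper may structure its guard differently.

```python
import threading


class LightGlueConcurrentAccessError(RuntimeError):
    """Local stand-in for the helper's error type."""


class SingleEntryGuard:
    """Context manager that raises instead of waiting when already held."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def __enter__(self) -> "SingleEntryGuard":
        # Non-blocking acquire returns False immediately on contention.
        if not self._lock.acquire(blocking=False):
            raise LightGlueConcurrentAccessError("runtime is already in use")
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self._lock.release()
        return False  # never swallow exceptions from the guarded block


guard = SingleEntryGuard()
with guard:
    try:
        with guard:  # overlapping entry while the outer block holds the guard
            nested = "entered"
    except LightGlueConcurrentAccessError:
        nested = "rejected"

# Once the outer block has exited, the guard can be entered again.
with guard:
    reusable = True
```

Because `threading.Lock` is not re-entrant, this guard also rejects a nested entry from the same thread, which the demonstration above uses to show the behaviour without spawning threads.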
Risk 3: A future backbone needs a different match shape
- Risk: A new feature backbone produces 5-tuple correspondences instead of the current 4-tuple (e.g., adds confidence per match).
- Mitigation: The contract version bump path is documented (Versioning Rules section). Adding a field is non-breaking IF consumers tolerate the extra field; otherwise it is a major-version contract change with a deprecation pass.
Runtime Completeness
- Named capability: shared LightGlue inference runtime with single-CUDA-stream guarantee + R14 structural cycle fix (architecture / E-CC-HELPERS / 03_helper_lightglue_runtime.md).
- Production code that must exist: real EngineHandle-backed match dispatch; real concurrent-access guard; real descriptor-dim validation.
- Allowed external stubs: a deterministic stub EngineHandle is allowed in tests (and recommended for AC-7 determinism) but production paths use C7's real engine.
- Unacceptable substitutes: replacing the concurrent-access guard with a blocking threading.Lock (silently serialising callers); allowing each consumer to construct its own runtime; reintroducing a C2.5 → C3 (or C3 → C2.5) import to "share state". Any of those reintroduces R14.
Contract
This task produces the contract at _docs/02_document/contracts/shared_helpers/lightglue_runtime.md.
Consumers MUST read that file — not this task spec — to discover the interface.