mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-23 02:21:12 +00:00
Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components (C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446 plus the _dependencies_table.md and component contract documents. State file updated to greenfield Step 7 (Implement), not_started. Co-authored-by: Cursor <cursoragent@cursor.com>
# C3 XFeat Alternate Lightweight Matcher

**Task**: AZ-347_c3_xfeat
**Name**: C3 XFeat Alternate Lightweight Matcher
**Description**: Implement `XFeatMatcher`, the lightweight alternate `CrossDomainMatcher`. XFeat performs feature extraction and matching in a single forward pass (no separate LightGlue stage) and is selected via `config.matcher.strategy = "xfeat"`. Target use case: low-power / thermal-throttled scenarios where DISK+LightGlue's combined cost (~180 ms p95) exceeds the C4 hybrid's degraded budget. The drop-and-continue, below-threshold, and best-candidate selection contracts are inherited from the Protocol unchanged. RANSAC and the median residual are still computed via the shared `RansacFilter`.
**Complexity**: 3 points
**Dependencies**: AZ-344 (Protocol + factory + DTOs + errors + RollingHealthWindow), AZ-263_initial_structure, AZ-269_config_loader, AZ-282_ransac_filter, AZ-298_c7_tensorrt_runtime, AZ-299_c7_onnxrt_fallback, AZ-303_c6_storage_interfaces, AZ-281_engine_filename_schema (XFeat engine self-describing filename), AZ-321_c10_engine_compiler (XFeat engine compile path), AZ-266_log_module, AZ-272_fdr_record_schema
**Component**: c3_matcher (epic AZ-257 / E-C3)
**Tracker**: AZ-347
**Epic**: AZ-257 (E-C3)
### Document Dependencies

- `_docs/02_document/contracts/c3_matcher/cross_domain_matcher_protocol.md`: Protocol contract.
- `_docs/02_document/components/04_c3_matcher/description.md`: § 1 XFeat alternate (lightweight); § 5 error handling; § 9 logging.
- `_docs/02_document/module-layout.md`: `BUILD_MATCHER_XFEAT` row.
- `_docs/02_document/contracts/shared_helpers/ransac_filter.md`: RANSAC filtering API.
- `_docs/02_document/contracts/c2_5_rerank/rerank_strategy_protocol.md`.
- `_docs/02_document/contracts/c7_inference/inference_runtime_protocol.md`.
## Problem

Without this task there is no lightweight matcher option for thermal-throttled scenarios. If the C4 hybrid switches to Jacobian (per ADR-006 / D-CROSS-LATENCY-1) while C3's per-frame budget still allows the heavy DISK+LightGlue path, the system has no mechanism to reduce C3's cost as well. XFeat is also the documented mandatory simple-baseline alternative for the IT-12 comparative study (AC-2.1a engine rule applied at the matcher level, with NetVLAD acting at the VPR level).
## Outcome

- `src/gps_denied_onboard/components/c3_matcher/xfeat.py` defining:
  - `XFeatMatcher` class implementing the `CrossDomainMatcher` Protocol.
  - Constructor: `__init__(self, runtime: InferenceRuntime, ransac_filter: RansacFilter, fdr_client: FdrClient, health_window: RollingHealthWindow, config: MatcherConfig)`. Note: NO `lightglue_runtime` argument; XFeat does not use LightGlue.
  - `match(frame, rerank_result, calibration)`:
    1. Decode + preprocess the nav-camera frame ONCE.
    2. For each `RerankCandidate` in `rerank_result.candidates`:
       a. Decode + preprocess the candidate tile.
       b. Run XFeat forward (single pass: outputs combined `correspondences` directly, since XFeat fuses extraction + matching).
       c. On failure: drop-and-continue (`MatcherBackboneError`, `phase="xfeat_forward"`).
       d. RANSAC + median residual via `ransac_filter.filter(correspondences, threshold_px=...)`, the same helper as DISK+LightGlue.
       e. Append `CandidateMatchSet` if survivors > 0.
    3. Below-threshold / all-failed → `InsufficientInliersError` (same semantics as AZ-345).
    4. Sort survivors descending by `inlier_count`; ties broken by `per_candidate_residual_px` ascending.
    5. WARN on residual above threshold; INFO on ready; FDR `matcher.frame_done` per frame.
    6. `RollingHealthWindow.update` after each frame (success or failure).
    7. `matcher_label = "xfeat"`.
  - Module-level `create(config, lightglue_runtime, ransac_filter, inference_runtime, health_window) -> CrossDomainMatcher`:
    1. `lightglue_runtime` is accepted in the signature for factory uniformity but NOT stored / used.
    2. `xfeat_weights_path = config.matcher.xfeat_weights_path` (TRT engine produced by AZ-321).
    3. Load XFeat engine via `inference_runtime.load_engine(...)`.
    4. Construct `XFeatMatcher(...)`.
- Composition-root wiring path for `config.matcher.strategy == "xfeat"`.
- `BUILD_MATCHER_XFEAT` flag wiring (ON in research; ON in airborne if config selects it; OFF in operator-tooling).
- All logging + FDR records identical structure to AZ-345 with `matcher_label = "xfeat"`.
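
The per-candidate loop and ordering rule in `match` steps 2 to 4 can be sketched as below. This is a minimal illustration, not the AZ-344 contract: the exception classes are local stand-ins, `select_candidates`, `forward_and_filter`, `candidate_id`, and `min_inliers` are hypothetical names, and treating "below-threshold" as "best survivor has fewer than `min_inliers` inliers" is an assumption about the AZ-345 semantics.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


class MatcherBackboneError(Exception):
    """Local stand-in for the AZ-344 error type (hypothetical definition)."""


class InsufficientInliersError(Exception):
    """Local stand-in for the AZ-344 error type (hypothetical definition)."""


@dataclass(frozen=True)
class CandidateMatchSet:
    candidate_id: str  # hypothetical field, used here only to identify results
    inlier_count: int
    per_candidate_residual_px: float


def select_candidates(
    candidate_ids: Sequence[str],
    forward_and_filter: Callable[[str], Tuple[int, float]],
    min_inliers: int,
) -> List[CandidateMatchSet]:
    """Run each candidate, drop failures, and return survivors best-first."""
    survivors: List[CandidateMatchSet] = []
    for cid in candidate_ids:
        try:
            # Stands in for XFeat forward + RansacFilter on one candidate tile.
            inliers, residual_px = forward_and_filter(cid)
        except MatcherBackboneError:
            continue  # drop-and-continue: one bad candidate must not abort the frame
        if inliers > 0:
            survivors.append(CandidateMatchSet(cid, inliers, residual_px))

    # Assumed reading of "below-threshold / all-failed".
    if not survivors or max(s.inlier_count for s in survivors) < min_inliers:
        raise InsufficientInliersError("no candidate reached the inlier floor")

    # Descending inlier_count; ties broken by ascending residual.
    survivors.sort(key=lambda s: (-s.inlier_count, s.per_candidate_residual_px))
    return survivors
```

Sorting on the tuple `(-inlier_count, per_candidate_residual_px)` keeps the ordering deterministic for equal inlier counts, which is what the determinism criterion relies on.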
## Scope

### Included

- `XFeatMatcher` implementation per the `CrossDomainMatcher` Protocol.
- XFeat forward via C7 `InferenceRuntime`.
- RANSAC + median residual via shared `RansacFilter` (NO LightGlue).
- Same drop-and-continue + below-threshold + best-candidate selection as AZ-345.
- Same `RollingHealthWindow.update` invocation pattern.
- Composition-root wiring path.
- XFeat-specific preprocessor inline.
- Unit tests covering Invariants 1–9 + drop-and-continue + below-threshold + deterministic ordering. Parametrised across XFeat-specific test fixtures (the lightweight model's output has a different shape from DISK's).
- `BUILD_MATCHER_XFEAT` flag wiring.
### Excluded

- The Protocol + DTOs + errors + factory + `RollingHealthWindow`: owned by AZ-344.
- `RansacFilter` (AZ-282).
- `LightGlueRuntime` (AZ-278): XFeat does NOT consume this helper; the factory's signature includes it for uniformity but XFeat's `create` ignores the parameter.
- C7 runtime stack (AZ-297..AZ-300).
- XFeat engine compile (AZ-321).
- Component-internal acceptance tests beyond Protocol + invariants smoke.
- DISK matcher (AZ-345) and ALIKED matcher (AZ-346).
## Acceptance Criteria

**AC-1 through AC-10**: identical contract to AZ-345 AC-1..AC-10 (Protocol conformance, best-candidate selection, drop-and-continue, below-threshold, residual WARN, health update, correspondences shape, determinism). `matcher_label = "xfeat"`.

**AC-11: Composition-root wiring**
Given `config.matcher.strategy = "xfeat"` AND `BUILD_MATCHER_XFEAT = ON`
When `compose_root(config)` runs
Then an `XFeatMatcher` instance is wired; ONE INFO log `kind="c3.matcher.ready"` with `{strategy: "xfeat", ...}` is emitted. The strategy does NOT hold a reference to `LightGlueRuntime` (verifiable via `not hasattr(strategy, "_lightglue_runtime")` OR `strategy._lightglue_runtime is None`).

**AC-12: FDR `matcher.frame_done` per frame**
Same shape as AZ-345 AC-12 with `matcher_label = "xfeat"`.

**AC-special-1: XFeat single-pass forward, no LightGlue call**
Given a `match(...)` call where the `LightGlueRuntime` test double is provided to the factory
When the call completes
Then `lightglue_runtime.match_*` is NEVER invoked (verified by mock assertion `lightglue_runtime.match_pair.assert_not_called()`).

**AC-special-2: XFeat lower latency than DISK+LightGlue (informational, not gated)**
Given identical hardware and identical inputs
When `match(...)` is microbenchmarked × 100 frames
Then XFeat's per-call p95 is < AZ-345's per-call p95. This is an informational metric: if XFeat is NOT faster, that is a backbone misconfiguration, not a contract violation. The result is documented in the test report and does NOT block this AC.
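
For AC-special-2, a per-call p95 over 100 frames could be collected along these lines. This is a sketch only: `benchmark_ms` and `p95` are hypothetical helper names, the nearest-rank percentile definition is an assumption (the spec does not say which percentile method C3-PT-01 uses), and wall-clock timing with `time.perf_counter` does not account for asynchronous GPU work unless the timed callable blocks until the result is ready.

```python
import time
from typing import Callable, List, Sequence


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile (assumed definition)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, giving a 1-based rank without float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def benchmark_ms(call: Callable[[], object], frames: int = 100) -> List[float]:
    """Time `call` once per frame and return per-call latencies in milliseconds."""
    latencies: List[float] = []
    for _ in range(frames):
        start = time.perf_counter()
        call()  # must block until the match result is actually available
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

Running `p95(benchmark_ms(...))` once per matcher on the same inputs gives the two numbers the criterion compares; since the AC is informational, the comparison would be reported rather than asserted.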
## Non-Functional Requirements

**Performance** (deferred to C3-PT-01):
- `match` p95 ≤ 100 ms (informational target; XFeat is the lightweight option). NOT a hard gate; the hard gate is C3-PT-01's overall envelope.
- GPU memory ≤ 300 MB (XFeat single engine; smaller than DISK+LightGlue).

**Compatibility**
- XFeat engine file format owned by C10 + C7.

**Reliability**
- Same as AZ-345.
## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|--------------|------------------|
| AC-1..AC-10 | Identical to AZ-345 AC-1..AC-10 with `matcher_label = "xfeat"` | Same outcomes |
| AC-11 | `compose_root(config)` with `config.matcher.strategy = "xfeat"` | Wired; INFO log; no LightGlue dependency |
| AC-12 | FDR `frame_done` emission | Correct fields; `matcher_label = "xfeat"` |
| AC-special-1 | LightGlue NOT invoked | `lightglue_runtime.match_pair.assert_not_called()` |
| AC-special-2 | Latency comparison | (Informational; not gated) |
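
The AC-special-1 row, together with the `hasattr` check from AC-11, might look like this with `unittest.mock`. It is a sketch under stated assumptions: `XFeatMatcherStub` is a hypothetical stand-in for the real class, `runtime.forward(...)` is an invented method name for the C7 call, and `threshold_px=3.0` is an arbitrary illustrative value. Only the `create` parameter order, `ransac_filter.filter(correspondences, threshold_px=...)`, and `match_pair.assert_not_called()` come from this spec.

```python
from unittest.mock import MagicMock


class XFeatMatcherStub:
    """Hypothetical stand-in for XFeatMatcher; keeps no LightGlue reference."""

    def __init__(self, runtime, ransac_filter):
        self._runtime = runtime
        self._ransac_filter = ransac_filter

    def match(self, frame, rerank_result, calibration):
        # `forward` is an assumed method name on the inference runtime double.
        correspondences = self._runtime.forward(frame, rerank_result)
        return self._ransac_filter.filter(correspondences, threshold_px=3.0)


def create(config, lightglue_runtime, ransac_filter, inference_runtime, health_window):
    # lightglue_runtime is accepted for factory uniformity and deliberately dropped.
    return XFeatMatcherStub(inference_runtime, ransac_filter)


def check_lightglue_unused() -> bool:
    lightglue_runtime = MagicMock(name="LightGlueRuntime")
    matcher = create(MagicMock(), lightglue_runtime, MagicMock(), MagicMock(), MagicMock())
    matcher.match(frame=object(), rerank_result=object(), calibration=object())

    lightglue_runtime.match_pair.assert_not_called()  # AC-special-1
    assert lightglue_runtime.mock_calls == []         # no match_* call of any kind
    assert not hasattr(matcher, "_lightglue_runtime")  # AC-11 reference check
    return True
```

Checking `mock_calls == []` in addition to `match_pair.assert_not_called()` covers the broader `match_*` wording of the criterion, since any method call on the double would be recorded there.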
## Constraints

- **Same constraints as AZ-345**: drop-and-continue mandatory, median residual, constructor injection, helpers constructor-injected, engine load at `create` time, `RollingHealthWindow.update` called exactly once per `match`.
- **`LightGlueRuntime` is NOT consumed**: the factory's `create` signature accepts it for uniformity (so AZ-344's factory can call all three matchers' `create` with the same args), but `XFeatMatcher` does NOT store or use it. Test AC-special-1 enforces this.
- **XFeat-specific preprocessing parameters are hard-coded** (weights-coupled, same rule as DISK and ALIKED).
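
One way to satisfy the "exactly once per `match`" constraint on `RollingHealthWindow.update`, on both the success and the failure path, is a `try` / `finally` wrapper. The sketch below is illustrative only: `RollingHealthWindowStub`, `match_with_health`, and the single boolean argument to `update` are assumptions, since the real signature is owned by AZ-344.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class InsufficientInliersError(Exception):
    """Local stand-in for the AZ-344 error type (hypothetical definition)."""


class RollingHealthWindowStub:
    """Records updates; the boolean `success` argument is an assumed signature."""

    def __init__(self) -> None:
        self.updates: List[bool] = []

    def update(self, success: bool) -> None:
        self.updates.append(success)


def match_with_health(run_match: Callable[[], T], health_window: RollingHealthWindowStub) -> T:
    """Run one frame's matching and update the health window exactly once."""
    success = False
    try:
        result = run_match()
        success = True
        return result
    finally:
        # Executes on normal return and on any raised error, so there is
        # exactly one update per call regardless of outcome.
        health_window.update(success)
```

Placing the update in `finally` avoids duplicating the call in every error branch, which is where a double or missing update would most likely creep in.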
## Risks & Mitigation

**Risk 1: XFeat output schema differs from DISK+LightGlue output (correspondences format)**
- *Mitigation*: XFeat outputs a `correspondences` ndarray of shape `(M, 4)` with columns `(px_query, py_query, px_tile, py_tile)`, the same as the post-LightGlue output of DISK+LightGlue. The shared `RansacFilter` consumes this format identically. If XFeat's upstream output differs, this task adapts it inside the strategy.

**Risk 2: XFeat's RANSAC inlier counts may be systematically lower** (lighter-weight model produces noisier matches)
- *Mitigation*: AC-2.1a engine rule applies (XFeat is the simple baseline at the matcher level); the ≥ 80 inlier count floor (AC-1.1) may not hold for XFeat. C3-IT-01 measures this; if XFeat fails AC-1.1 on Derkachi, it remains the "engine rule" comparison baseline, NOT the production default. These are the same engine-rule semantics as NetVLAD at C2.

**Risk 3: Linking three backbones into one binary inflates the GPU memory footprint**
- *Mitigation*: per ADR-002 / D-C7-13, only the SELECTED backbone's engine is loaded at `create` time. Linking does NOT load engines; loading happens lazily in each backbone's `create`. The factory invokes only ONE `create` per binary lifetime.
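
The Risk 3 mitigation, where all backbones are linked but only the selected one loads an engine, can be illustrated with a small registry. Everything here is a sketch: `MATCHER_FACTORIES`, `build_matcher`, and `loaded_engines` are hypothetical names, the strategy keys `"disk"` and `"aliked"` are assumed (only `"xfeat"` appears in this spec), and appending to a list stands in for `inference_runtime.load_engine(...)`.

```python
from typing import Callable, Dict, List

# Records which engines were "loaded"; stands in for real GPU allocations.
loaded_engines: List[str] = []


def _make_create(name: str) -> Callable[[], str]:
    def create() -> str:
        # Engine loading happens here, inside create, not at import/link time.
        loaded_engines.append(name)
        return f"{name}-matcher"

    return create


# Registering a factory does not call it, so no engine is loaded yet.
MATCHER_FACTORIES: Dict[str, Callable[[], str]] = {
    name: _make_create(name) for name in ("disk", "aliked", "xfeat")
}


def build_matcher(strategy: str) -> str:
    """Invoke exactly one backbone's create, chosen by the configured strategy."""
    if strategy not in MATCHER_FACTORIES:
        raise ValueError(f"unknown matcher strategy: {strategy!r}")
    return MATCHER_FACTORIES[strategy]()
```

After `build_matcher("xfeat")`, `loaded_engines` contains only `"xfeat"`: the other two backbones remain registered but never touch the runtime.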
## Runtime Completeness

- **Named capability**: `XFeatMatcher`, the alternate lightweight `CrossDomainMatcher` (architecture / E-C3 / `solution.md` / AC-2.1a engine rule at matcher level).
- **Production code that must exist**: real `XFeatMatcher` calling real C7 `InferenceRuntime` with real TRT-compiled XFeat engine; real shared `RansacFilter` for inlier filtering + median residual; real `RollingHealthWindow.update` after each frame; real composition-root wiring.
- **Allowed external stubs**: `FakeInferenceRuntime`, `FakeRansacFilter`, `FakeFdrClient`, `FakeLightGlueRuntime` (passed but unused).
- **Unacceptable substitutes**: a Python+NumPy XFeat forward (would not satisfy the lightweight-target latency); using a different RANSAC implementation; storing/calling `LightGlueRuntime` (would defeat XFeat's single-pass design).