Decompose Step 6 snapshot: 140 task specs + contract docs

Closes out greenfield Step 6 (Decompose) for all 14 components
(C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446
plus the _dependencies_table.md and component contract documents.

State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-11 00:39:48 +03:00
parent 8171fcb29e
commit 880eabcb3f
172 changed files with 22897 additions and 35 deletions
@@ -0,0 +1,170 @@
# Contract: `ConditionalRefiner` Protocol
**Owner**: c3_5_adhop (epic AZ-258 / E-C3.5)
**Producer task**: AZ-348 (Protocol + factory + DTOs + composition + `PassthroughRefiner`)
**Consumer tasks**: AZ-349 (`AdHoPRefiner` real refinement); downstream c4_pose (epic AZ-259) which consumes the (possibly refined) `MatchResult`
**Version**: 1.0.0
**Status**: draft, awaiting Producer task implementation
**Last Updated**: 2026-05-10
**Module-layout home**: `src/gps_denied_onboard/components/c3_5_adhop/interface.py` (Protocol), `src/gps_denied_onboard/components/c3_5_adhop/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/refiner_factory.py` (factory)
> **Public API symbol naming.** The component's public interface symbol is named `ConditionalRefiner` in `description.md` § 2 and `AdHoPRefinementStrategy` in `module-layout.md` § c3_5_adhop. Both refer to the SAME Protocol. The canonical class name in code is `ConditionalRefiner`: it describes the role and matches the method `refine_if_needed`. The producer task ALSO updates `module-layout.md` to align (`AdHoPRefinementStrategy` → `ConditionalRefiner`) so the two documents agree.
## Purpose
Defines the public interface for every C3.5 refinement strategy: `refine_if_needed(frame, mr, residual_threshold_px)` returns a `MatchResult` that is either (a) the input unchanged ("passthrough") OR (b) enriched with refined inlier correspondences from OrthoLoC AdHoP perspective preconditioning. The conditional gate is a configurable residual threshold: if the input `MatchResult.reprojection_residual_px` ≤ threshold the refiner returns the input unchanged; otherwise the refiner runs the AdHoP backbone and returns an enriched `MatchResult`. `was_invoked()` exposes the last-call decision for FDR provenance and for NFT-PERF-01 invocation-rate accounting.
Two concrete strategies are linked into the production binary by default: `AdHoPRefiner` (production-default; conditional invocation) and `PassthroughRefiner` (always passes through; non-conditional baseline used by smoke tests and by IT-12's "no refinement" comparison). Both implementations co-exist at build time per ADR-001 — gating is at runtime via `config.refiner.strategy`. Build-time exclusion (ADR-002) is NOT used here because both strategies are tiny (passthrough is a no-op; AdHoP's backbone is a single TRT engine shared with C7).
The shared `RansacFilter` helper (AZ-282) is constructor-injected — `c3_5_adhop` imports the SAME helper used by `c3_matcher` and `c4_pose`; the runtime root constructs ONE instance and identity-shares it across all three components.
## Public API
### Protocol: `ConditionalRefiner`
```python
from typing import Protocol, runtime_checkable

from gps_denied_onboard._types import (
    NavCameraFrame,
    MatchResult,
)


@runtime_checkable
class ConditionalRefiner(Protocol):
    """Conditional refinement strategy invoked between C3 (matcher) and C4 (pose). Stateless per-frame; the only persistent state is the constructor-injected backbone runtime handle + the last-invocation flag."""

    def refine_if_needed(
        self,
        frame: NavCameraFrame,
        mr: MatchResult,
        residual_threshold_px: float,
    ) -> MatchResult:
        """If `mr.reprojection_residual_px <= residual_threshold_px` (the steady-state path), return `mr` unchanged AND set `was_invoked()` to False. Otherwise, run the strategy's refinement procedure and return an enriched `MatchResult` with `refinement_label` set, AND set `was_invoked()` to True.

        On `RefinerBackboneError` (AdHoP backbone failure during the invoked path), the refiner MUST fall through to passthrough: return `mr` unchanged with `refinement_label = "passthrough"` AND `was_invoked()` = True (the attempt counts towards the invocation rate even on failure). The error is logged at ERROR level + emitted to FDR; downstream pose estimation may then trigger F6 satellite re-localisation if quality gates fail.

        Determinism: same inputs MUST produce the same output. The conditional gate is a `<=` comparison only: no probabilistic gating, no time-based gating.
        """
        ...

    def was_invoked(self) -> bool:
        """Return True iff the last call to `refine_if_needed` actually entered the refinement procedure (regardless of whether it produced a refined result or fell through to passthrough on backbone error). Reset to False at the start of every `refine_if_needed` call. Used by FDR per-frame provenance and by NFT-PERF-01 / C3.5-IT-03 invocation-rate accounting."""
        ...
```
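Because the Protocol is `@runtime_checkable`, a composition root or test can verify conformance structurally with `isinstance`. The sketch below is illustrative only: it uses a stand-in copy of the Protocol with `Any` in place of the real DTO types, and `NoOpRefiner` and `MissingFlag` are hypothetical classes that do not exist in the codebase.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ConditionalRefiner(Protocol):
    """Stand-in copy of the Protocol; `Any` replaces the real DTO types."""

    def refine_if_needed(self, frame: Any, mr: Any, residual_threshold_px: float) -> Any: ...

    def was_invoked(self) -> bool: ...


class NoOpRefiner:
    """Hypothetical strategy: satisfies the Protocol without inheriting from it."""

    def refine_if_needed(self, frame: Any, mr: Any, residual_threshold_px: float) -> Any:
        return mr

    def was_invoked(self) -> bool:
        return False


class MissingFlag:
    """Hypothetical class that lacks `was_invoked`, so the structural check fails."""

    def refine_if_needed(self, frame: Any, mr: Any, residual_threshold_px: float) -> Any:
        return mr


conforms = isinstance(NoOpRefiner(), ConditionalRefiner)
rejected = isinstance(MissingFlag(), ConditionalRefiner)
```

Note that the runtime check only confirms that both methods are present; it does not validate signatures or return types, so a static type checker is still needed for full conformance.
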
**Invariants**:
1. **Single-threaded by contract** — each instance is bound to one ingest thread (composition root enforces; same thread as C3 because they share the C-frame ingest path).
2. **Stateless per-frame for `refine_if_needed`** — except for the `was_invoked()` flag, no implicit dependency on prior frames; reordering `refine_if_needed` calls (tests only) MUST yield identical output `MatchResult` content.
3. **Conditional gate is a pure comparison**: `mr.reprojection_residual_px <= threshold` → passthrough; `>` → invoke. No tolerance, no smoothing, no hysteresis. The threshold is a parameter (NOT a hidden internal constant) so operator tooling can tune pre-flight per AC-NEW-5 / R10.
4. **Passthrough fall-through on backbone error**: `RefinerBackboneError` raised inside the invoked path is caught by the strategy and converted to passthrough output (input `MatchResult` returned unchanged with `refinement_label = "passthrough"`); the error is logged at ERROR level. The exception is NEVER re-raised out of `refine_if_needed` (downstream pose estimation gets a usable `MatchResult` and decides whether to trigger F6).
5. **Bit-identical correspondences on passthrough** — when `refinement_label == "passthrough"`, every `inlier_correspondences` ndarray in the output equals the input ndarray bit-for-bit (`np.array_equal` AND same dtype). Refinement may NEVER silently rewrite correspondences when the gate decided not to invoke.
6. **`refinement_label` is `"adhop"` OR `"passthrough"`**: exactly one of those two values. `"adhop"` means AdHoP ran successfully; `"passthrough"` covers both a gate-decided passthrough and an AdHoP attempt that fell through to passthrough (`PassthroughRefiner` always emits `"passthrough"`). Readers check `was_invoked()` to tell a gate-decided passthrough from a fall-through.
7. **`refinement_added_latency_ms` is the STRATEGY-INTERNAL added latency** — not the matcher's or pose estimator's; covers exactly the work done inside `refine_if_needed`. Always ≥ 0; near-zero on passthrough; up to ~90 ms on AdHoP invoke per AC C3.5-PT-01.
8. **`was_invoked()` semantics**: True iff the strategy entered the refinement procedure (post-gate, regardless of whether AdHoP succeeded or fell through). False for every `PassthroughRefiner` call and for every gate-decided passthrough call.
9. **Threshold validation** — the strategy MUST reject `residual_threshold_px <= 0` (raise `ValueError`); the composition root validates the config-loaded threshold at startup so this in-method check is defensive.
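
Invariants 3, 8 and 9 can be sketched together in a few lines. This is a minimal illustration, not the `AdHoPRefiner` implementation: `GatedRefiner` and its injected `refine` callable are hypothetical, and the `MatchResult` here is a stand-in carrying only the fields the gate needs.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass(frozen=True)
class MatchResult:
    """Stand-in with only the fields this sketch needs."""
    reprojection_residual_px: float
    refinement_label: str = "passthrough"


class GatedRefiner:
    """Hypothetical strategy showing the gate, the flag reset, and threshold validation."""

    def __init__(self, refine: Callable[[Any, MatchResult], MatchResult]) -> None:
        self._refine = refine      # injected refinement procedure (assumption)
        self._invoked = False

    def refine_if_needed(self, frame: Any, mr: MatchResult, residual_threshold_px: float) -> MatchResult:
        self._invoked = False                      # reset at the start of every call
        if residual_threshold_px <= 0:             # Invariant 9: defensive check
            raise ValueError("residual_threshold_px must be > 0")
        if mr.reprojection_residual_px <= residual_threshold_px:
            return mr                              # Invariant 3: inclusive gate, input returned as-is
        self._invoked = True                       # Invariant 8: set once past the gate
        return self._refine(frame, mr)

    def was_invoked(self) -> bool:
        return self._invoked


refiner = GatedRefiner(lambda frame, mr: replace(mr, refinement_label="adhop"))

at_threshold = MatchResult(reprojection_residual_px=2.5)
same = refiner.refine_if_needed(None, at_threshold, 2.5)
invoked_at_threshold = refiner.was_invoked()

above = MatchResult(reprojection_residual_px=2.5 + 1e-6)
refined = refiner.refine_if_needed(None, above, 2.5)
invoked_above = refiner.was_invoked()
```

Returning the very same object on the passthrough path is one simple way to satisfy the bit-identical requirement of Invariant 5, since nothing is copied or rewritten.
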
### DTOs (in `_types/refiner.py` — additions; reuse `MatchResult` from `_types/matcher.py`)
The output of `refine_if_needed` is a `MatchResult` (same DTO as C3 produces) with the following NEW optional fields populated by C3.5:
```python
# Additions to existing MatchResult in _types/matcher.py (NOT a new DTO; in-place extension)
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class MatchResult:
    # ... existing fields from C3 ...
    # NEW (populated by C3.5; default values for non-refined frames):
    refinement_label: str = "passthrough"  # "adhop" | "passthrough"
    refinement_added_latency_ms: float = 0.0  # added latency due to refinement; 0 on pure passthrough
```
Rationale: `MatchResult` is produced by C3 and consumed by C3.5 (which may enrich it) and by C4. Because `MatchResult` is a frozen dataclass, C3.5 produces a NEW `MatchResult` instance via `dataclasses.replace(...)` whenever it enriches. The new fields default to the passthrough values, so a `MatchResult` from C3 that never goes through C3.5 is still valid and readable downstream.
> **Cross-task coordination.** AZ-344 (C3 Protocol task) defines the `MatchResult` DTO with the C3 fields. The C3.5 Producer task (AZ-348) extends `MatchResult` with the two NEW fields (with their defaults) in the SAME `_types/matcher.py` file. Because the fields default to passthrough values, the addition is backward-compatible for AZ-344's tests; AZ-344's `MatchResult` constructor stays valid. The C3.5 Producer task is responsible for updating AZ-344's frozen-dataclass tests (if any) to assert the new field defaults.
### Error hierarchy (in `c3_5_adhop/errors.py`)
```python
class RefinerError(Exception):
    """Base class for all C3.5 refinement-strategy errors."""


class RefinerBackboneError(RefinerError):
    """AdHoP backbone forward failed (TensorRT exception, OOM, NaN, shape mismatch). Caught inside `refine_if_needed`; converted to passthrough fall-through; never re-raised out of the strategy."""


class RefinerConfigError(RefinerError):
    """Composition-root rejected the refiner config (unknown strategy, invalid threshold). Raised at startup ONLY; never per-frame."""
```
The error hierarchy is intentionally small — drop-and-continue at the C3 matcher level handles per-candidate failures already; at C3.5 the only failure mode is the AdHoP backbone, and it is contained within the strategy via passthrough fall-through (Invariant 4).
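
A minimal sketch of the containment described above, under stated assumptions: `refine_with_fall_through` and `failing_backbone` are hypothetical helpers, the logger name is invented, the `MatchResult` is a stand-in, and FDR emission is omitted.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable

log = logging.getLogger("c3_5_adhop")  # logger name is an assumption


class RefinerError(Exception):
    """Base class for all C3.5 refinement-strategy errors."""


class RefinerBackboneError(RefinerError):
    """AdHoP backbone forward failed."""


@dataclass(frozen=True)
class MatchResult:
    """Stand-in with only the fields this sketch needs."""
    reprojection_residual_px: float
    refinement_label: str = "passthrough"


def refine_with_fall_through(
    backbone: Callable[[Any, MatchResult], MatchResult],
    frame: Any,
    mr: MatchResult,
) -> MatchResult:
    """Invoked path only: contain backbone failures instead of re-raising them."""
    try:
        return backbone(frame, mr)
    except RefinerBackboneError:
        # Invariant 4: log at ERROR level and hand the input back unchanged.
        log.exception("AdHoP backbone failed; falling through to passthrough")
        return mr


def failing_backbone(frame: Any, mr: MatchResult) -> MatchResult:
    """Hypothetical backbone that always fails, for demonstration."""
    raise RefinerBackboneError("simulated TensorRT failure")


mr_in = MatchResult(reprojection_residual_px=9.0)
mr_out = refine_with_fall_through(failing_backbone, None, mr_in)
```

Only `RefinerBackboneError` is caught here, so any other exception type would still propagate; whether the real strategy should wrap lower-level runtime exceptions into `RefinerBackboneError` first is left to the implementing task.
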
### Composition-root factory
```python
# In src/gps_denied_onboard/runtime_root/refiner_factory.py
from gps_denied_onboard._types import config
from gps_denied_onboard.helpers.ransac_filter import RansacFilter
from gps_denied_onboard.components.c7_inference.interface import InferenceRuntime
from gps_denied_onboard.components.c3_5_adhop.interface import ConditionalRefiner


def build_refiner_strategy(
    config: config.AppConfig,
    ransac_filter: RansacFilter,
    inference_runtime: InferenceRuntime,
) -> ConditionalRefiner:
    """Construct the configured C3.5 strategy at composition-root time. Selects between `AdHoPRefiner` and `PassthroughRefiner` per `config.refiner.strategy`. Both strategies are imported eagerly (no `BUILD_REFINER_*` flag gating; both are linked unconditionally), so selection happens at runtime only.

    Raises:
        RefinerConfigError: unknown strategy name OR invalid threshold (≤ 0).
    """
    ...
```
Strategy resolution table:
| `config.refiner.strategy` | Module path | Class | Notes |
|---|---|---|---|
| `"adhop"` | `gps_denied_onboard.components.c3_5_adhop.adhop_refiner` | `AdHoPRefiner` | production-default; conditional invocation. |
| `"passthrough"` | `gps_denied_onboard.components.c3_5_adhop.passthrough_refiner` | `PassthroughRefiner` | always-passthrough; baseline / smoke / IT-12 comparison. |
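
One way the resolution table could be expressed in code is a plain dictionary lookup. The sketch below is an assumption about shape, not the factory's actual body: `RefinerConfig`, `resolve_strategy`, and the empty strategy classes are stand-ins, and dependency injection is left out.

```python
from dataclasses import dataclass


class RefinerConfigError(Exception):
    """Stand-in for the startup-only config error."""


class AdHoPRefiner:
    """Stand-in; the real class takes injected dependencies."""


class PassthroughRefiner:
    """Stand-in for the always-passthrough strategy."""


@dataclass(frozen=True)
class RefinerConfig:
    """Hypothetical shape of `config.refiner`."""
    strategy: str
    residual_threshold_px: float = 2.5


_STRATEGIES = {"adhop": AdHoPRefiner, "passthrough": PassthroughRefiner}


def resolve_strategy(cfg: RefinerConfig) -> type:
    """Map the config value to a class, rejecting bad config at startup."""
    if cfg.residual_threshold_px <= 0:
        raise RefinerConfigError(
            f"residual_threshold_px must be > 0, got {cfg.residual_threshold_px}"
        )
    try:
        return _STRATEGIES[cfg.strategy]
    except KeyError:
        raise RefinerConfigError(f"unknown refiner strategy: {cfg.strategy!r}") from None


chosen = resolve_strategy(RefinerConfig(strategy="passthrough"))
```

With this layout, adding a third strategy would amount to one new dictionary entry, which matches the intent that the resolution table grows without changing the contract surface.
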
Config-load-time validation (in AZ-269):
- `config.refiner.strategy` (enum, required): `"adhop"` | `"passthrough"`.
- `config.refiner.residual_threshold_px` (float, default `2.5`): must be > 0.
- `config.refiner.invocation_rate_warn_threshold` (float, default `0.25`): rolling-60s threshold above which a WARN log is emitted (per description.md § 9). Must be in `(0, 1)`.
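
The rolling-60s invocation rate behind `invocation_rate_warn_threshold` could be tracked with a timestamped deque. `InvocationRateMonitor` is a hypothetical helper written for this example; the contract does not prescribe where the accounting lives or how the window is implemented.

```python
from collections import deque


class InvocationRateMonitor:
    """Hypothetical helper: fraction of calls in the last `window_s` seconds that were invoked."""

    def __init__(self, warn_threshold: float = 0.25, window_s: float = 60.0) -> None:
        if not 0.0 < warn_threshold < 1.0:
            raise ValueError("warn_threshold must be in (0, 1)")
        self._warn_threshold = warn_threshold
        self._window_s = window_s
        self._events: deque[tuple[float, bool]] = deque()  # (timestamp_s, was_invoked)
        self._invoked_count = 0

    def record(self, timestamp_s: float, was_invoked: bool) -> bool:
        """Add one frame; return True when the rolling rate exceeds the warn threshold."""
        self._events.append((timestamp_s, was_invoked))
        self._invoked_count += was_invoked
        # Evict frames that fell out of the window.
        while self._events and self._events[0][0] <= timestamp_s - self._window_s:
            _, old_flag = self._events.popleft()
            self._invoked_count -= old_flag
        return self.rate() > self._warn_threshold

    def rate(self) -> float:
        return self._invoked_count / len(self._events) if self._events else 0.0


monitor = InvocationRateMonitor()
# Ten frames, one second apart; only frames 0 and 5 entered the refinement procedure.
warnings = [monitor.record(float(t), t % 5 == 0) for t in range(10)]
final_rate = monitor.rate()
```

In this run two of the ten frames were invoked, so the final rate is 0.2 and the last call does not ask for a WARN. The caller would feed `was_invoked()` into `record` after each `refine_if_needed` call and emit the log line itself.
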
## Test expectations summarised by Invariant
| Invariant | Test name | Assertion |
|---|---|---|
| 1 | thread-binding | composition root binds the strategy to ONE ingest thread; second binding raises `RuntimeError`. |
| 2 | stateless reorder | shuffle 10 frames → output content identical to in-order pass; `was_invoked()` flags identical positionwise. |
| 3 | gate semantics | residual = threshold → passthrough (`<=` is inclusive); residual = threshold + 1e-6 → invoked. |
| 4 | backbone-error fall-through | monkey-patch backbone to raise `RefinerBackboneError`; `refine_if_needed` returns input unchanged with `refinement_label = "passthrough"`; ERROR log emitted; `was_invoked()` is True. |
| 5 | bit-identical on passthrough | when `refinement_label == "passthrough"`, every `inlier_correspondences` array satisfies `np.array_equal(out, in_) and out.dtype == in_.dtype`. |
| 6 | label values | every output's `refinement_label` is in `{"adhop", "passthrough"}`. |
| 7 | added-latency bounds | every output's `refinement_added_latency_ms >= 0`; passthrough p95 ≤ 0.5 ms; AdHoP-invoked p95 ≤ 90 ms. |
| 8 | `was_invoked()` semantics | gate-passthrough: False; AdHoP-success: True; AdHoP-fall-through: True; PassthroughRefiner: always False. |
| 9 | threshold validation | `residual_threshold_px = 0` → `ValueError` raised by the strategy; `RefinerConfigError` raised by `build_refiner_strategy` at startup. |
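
As a rough illustration of how rows 2 and 3 of the table might translate into tests, the sketch below uses plain `assert` functions that also run under pytest. `StubRefiner`, `run`, and the residual values are hypothetical; the `MatchResult` is a stand-in.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MatchResult:
    """Stand-in with only the fields this sketch needs."""
    reprojection_residual_px: float
    refinement_label: str = "passthrough"


class StubRefiner:
    """Hypothetical deterministic strategy used only to exercise the tests."""

    def __init__(self) -> None:
        self._invoked = False

    def refine_if_needed(self, frame, mr, residual_threshold_px):
        self._invoked = mr.reprojection_residual_px > residual_threshold_px
        return replace(mr, refinement_label="adhop") if self._invoked else mr

    def was_invoked(self) -> bool:
        return self._invoked


def run(refiner, indexed_residuals, threshold=2.5):
    """Return {frame index: (output, was_invoked)} for the given processing order."""
    results = {}
    for idx, residual in indexed_residuals:
        out = refiner.refine_if_needed(None, MatchResult(residual), threshold)
        results[idx] = (out, refiner.was_invoked())
    return results


def test_gate_semantics():
    refiner = StubRefiner()
    refiner.refine_if_needed(None, MatchResult(2.5), 2.5)
    assert not refiner.was_invoked()       # residual == threshold: passthrough
    refiner.refine_if_needed(None, MatchResult(2.5 + 1e-6), 2.5)
    assert refiner.was_invoked()           # just above the threshold: invoked


def test_stateless_reorder():
    frames = list(enumerate([0.5, 3.0, 2.5, 7.1, 1.9, 2.6, 0.1, 4.4, 2.4, 9.9]))
    shuffled = frames[:]
    random.Random(0).shuffle(shuffled)     # fixed seed keeps the test deterministic
    assert run(StubRefiner(), frames) == run(StubRefiner(), shuffled)


test_gate_semantics()
test_stateless_reorder()
```

Keying the results by frame index lets the reorder test compare outputs and `was_invoked()` flags per frame regardless of processing order.
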
## What this contract does NOT define
- The AdHoP TRT engine compile path — owned by AZ-321 (engine compiler).
- The AdHoP forward pass implementation — owned by C7 `InferenceRuntime` consumers.
- The `RansacFilter` API — owned by AZ-282; this contract only consumes it.
- The downstream pose estimator's behaviour when `refinement_added_latency_ms` is high — owned by E-C4 (D-CROSS-LATENCY-1 hybrid is C4-internal).
## Producer-task / consumer-task split
- The Protocol task (AZ-348) ships: Protocol, DTO extension to `MatchResult`, error hierarchy, composition-root factory, config schema extension, AND the `PassthroughRefiner` (because it is a 1-pt no-op that naturally accompanies the Protocol task and acts as the reference implementation for tests).
- The AdHoPRefiner task (AZ-349) ships: `AdHoPRefiner` only (TRT engine load, perspective preconditioning, conditional gate, backbone-error fall-through to passthrough), plus the composition-root wiring path for `config.refiner.strategy = "adhop"`.
## Versioning + change policy
- Protocol method-signature changes (signatures of `refine_if_needed` or `was_invoked`) are MAJOR-version bumps. Every concrete strategy must be updated in lockstep.
- DTO field additions (e.g., a future `refinement_iterations: int`) are MINOR. Field removals are MAJOR.
- Adding a third strategy (e.g., a learned-conditional refiner) is a feature-cycle change; it adds an entry to the resolution table without changing this contract's surface.