From 360aece7a6b782788963318a047f0cbf5a6b3cd6 Mon Sep 17 00:00:00 2001
From: Oleksandr Bezdieniezhnykh <oleksandr.bezdieniezhnykh@pwc.com>
Date: Thu, 14 May 2026 04:12:47 +0300
Subject: [PATCH] [AZ-528] [AZ-335] [AZ-345..AZ-347] [AZ-349] Cumulative review
 55-57
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cumulative code review for the C3 / C3.5 cross-domain matching
pipeline going live (B55 facade-spine consolidation, B56 warm-start
+ F8 reboot recovery, B57 three concrete matchers + AdHoP refiner).
Verdict PASS_WITH_WARNINGS — three Low findings, no Critical / High
/ Architecture issues. Cumulative-52-54 Medium F1 (c1_vio
facade-spine duplication) closed by AZ-528 with regression guards.

State: last_completed_batch=57, last_cumulative_review=batches_55-57,
current_batch=58.

Co-authored-by: Cursor <cursoragent@cursor.com>
---
 ...tive_review_batches_55-57_cycle1_report.md | 265 ++++++++++++++++++
 _docs/_autodev_state.md                       |   6 +-
 2 files changed, 268 insertions(+), 3 deletions(-)
 create mode 100644 _docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md

diff --git a/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md b/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md
new file mode 100644
index 0000000..d7289f2
--- /dev/null
+++ b/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md
@@ -0,0 +1,265 @@
+# Cumulative Code Review — Batches 55-57 (Cycle 1)
+
+**Date**: 2026-05-14
+**Range**: batches 55 (AZ-528 — `c1_vio` facade-spine consolidation), 56 (AZ-335 — C1 warm-start + F8 reboot recovery), 57 (AZ-345/346/347/349 — C3 concrete matchers + C3.5 AdHoP refiner)
+**Compared against**: previous cumulative review batches 52-54
+**Verdict**: **PASS_WITH_WARNINGS**
+
+## Scope
+
+40 files changed across batches 55-57 (`git diff --stat 21cef8b..HEAD`):
+
+- 21 src/ files (10 new, 11 modified) across `c1_vio`, `c3_matcher`,
+  `c3_5_adhop`, `runtime_root`, `fdr_client`.
+- 5 test files (3 new, 2 modified).
+- 6 doc files (3 batch reports + 3 review reports).
+- 6 _docs / housekeeping (state file, dependencies table, task moves).
+
+The trio is **the C-3/C-3.5 cross-domain matching pipeline going live**:
+- B55 closed the cumulative-review-49-51 follow-up (3-way duplication
+  in c1_vio strategy facades — AZ-528).
+- B56 added warm-start hint persistence + F8 reboot recovery wiring,
+  which both producers (any concrete `VioStrategy`) and consumers
+  (operator orchestrator + composition root) now exercise.
+- B57 made C3 (3 strategies) and C3.5 (production-default refiner)
+  runnable end-to-end against real engine handles; the AZ-344 +
+  AZ-348 factories now have real strategy targets for every
+  configured `config.matcher.strategy` and `config.refiner.strategy`
+  value.
+
+## Carry-over closure from cumulative review 52-54
+
+| Prior finding | Status | Notes |
+|---------------|--------|-------|
+| F1 (Medium) — c1_vio strategy facade spine duplicated 3-way | **CLOSED by AZ-528 / Batch 55** | `_facade_spine.py` lands the shared helpers + `FacadeSpine` mixin; `okvis2.py`, `vins_mono.py`, `klt_ransac.py` now import from the spine. Regression guards added in `tests/unit/c1_vio/test_az528_facade_spine.py` (AST walk + import grep + behaviour parity). |
+| F2 (Low) — c1_vio test fakes not yet shared | **OPEN — carry forward** | No code touched this in B55-57. Still no active drift; KLT/RANSAC and the OKVIS2+VINS-Mono pair use genuinely different abstractions (monkeypatch vs fake backend). Continues to be a 1-2 point future hygiene pass. |
+
+## Findings (this window)
+
+| # | Severity | Category        | File:Line | Title |
+|---|----------|-----------------|-----------|-------|
+| F1 | Low      | Maintainability | `c3_matcher/_engine_output_assertion.py:60` vs `c2_vpr/_engine_dim_assertion.py:86` | Two parallel engine-output-probe helpers with different probe dtypes (FP32 vs FP16) — pattern is consistent but the probe-dtype divergence creates a footgun for future keypoint-backbone work |
+| F2 | Low      | Style           | `c3_matcher/xfeat.py:33-40` | `xfeat.py` imports four underscore-prefixed symbols from sibling `_pipeline.py` |
+| F3 | Low      | Scope           | AZ-347 AC-special-2 | Latency comparison (XFeat < DISK+LightGlue p95) flagged in B57 per-batch review; carried forward to keep cross-batch traceability |
+
+## Finding Details
+
+### F1: Two parallel engine-output-probe helpers (Low / Maintainability)
+
+- **Locations**:
+  - `src/gps_denied_onboard/components/c2_vpr/_engine_dim_assertion.py:86`
+    (introduced by AZ-527 / batch 52)
+  - `src/gps_denied_onboard/components/c3_matcher/_engine_output_assertion.py:60`
+    (introduced by AZ-346 / batch 57)
+
+- **Description**: Both helpers are dry-run engine probes invoked at
+  strategy `create()` time. C2 VPR backbones emit a single
+  descriptor tensor at `output_key`; C3 keypoint backbones emit
+  `keypoints` + `descriptors`. The two contracts genuinely differ —
+  this is **not a duplicate-symbol violation** (the rule from
+  cumulative-49-51 / -52-54). What it IS is a parallel pattern with
+  inconsistent probe dtypes: C2's helper allocates `np.float16`
+  (matching the FP16 backbones); C3's helper allocates `np.float32`.
+  AZ-346's ALIKED build config uses `PrecisionMode.FP16` — if a TRT
+  10.x engine binding rejects an FP32 tensor on input, the probe will
+  fail at startup with a confusing dtype error rather than the
+  intended schema check.
+
+- **Suggestion**: either (a) thread the engine's input dtype through
+  `assert_keypoint_engine_output_schema` and synthesise the probe in
+  the matching dtype, OR (b) hard-code the probe to `np.float16` and
+  document the FP16-backbones-only assumption. No fix required this
+  cycle — the existing tests cover the schema-mismatch path with a
+  fake runtime that accepts any dtype. Re-evaluate when AZ-321 lands
+  a real TRT-compiled ALIKED engine and the IT-level integration
+  surfaces the actual binding behaviour. Sized at <1 point.
+
+- **Why Low**: no current runtime exercises the FP16-binding path
+  end-to-end; the failure mode is "confusing startup error" not
+  "silent miscalculation". A future cumulative review may re-classify
+  if a TRT-level test surfaces the issue.
+
+### F2: XFeat imports underscore-prefixed helpers from `_pipeline.py` (Low / Style)
+
+- **Location**: `src/gps_denied_onboard/components/c3_matcher/xfeat.py:33-40`
+- **Description**: Recorded in the B57 per-batch review (F1). Carried
+  here so the next cumulative review sees the carry-over instead of
+  re-discovering. Convention-only: the four helpers
+  (`_emit_backbone_error`, `_emit_frame_done`, `_emit_insufficient_inliers`,
+  `_fail_all`) are component-internal but cross-file-public. Either
+  (a) drop the leading underscore, OR (b) document the
+  module-internal-but-cross-file convention in `_pipeline.py`'s
+  docstring. Both are sub-1-point changes.
+
+### F3: AZ-347 AC-special-2 latency benchmark not tested (Low / Scope)
+
+- **Location**: `_docs/02_tasks/done/AZ-347_c3_xfeat.md:91-94`
+- **Description**: AZ-347 explicitly marks AC-special-2 as
+  "informational metric; if XFeat is NOT faster, that's a backbone
+  misconfiguration, not a contract violation. Documented in the test
+  report; does NOT block this AC." Recorded in B57 per-batch review.
+  Carry-forward only for traceability; real benchmark belongs to
+  C3-PT-01 (E-BBT, deferred).
+
+## Phase Summary
+
+### Phase 1 — Context Loading
+
+- 6 task specs reviewed (AZ-528, AZ-335, AZ-345, AZ-346, AZ-347,
+  AZ-349) from `_docs/02_tasks/done/`.
+- Doc inputs: `_docs/02_document/architecture.md`,
+  `_docs/02_document/module-layout.md`, the 3 per-batch reviews
+  (`batch_55_review.md` / `batch_56_review.md` / `batch_57_review.md`),
+  the prior cumulative review (`cumulative_review_batches_52-54_cycle1_report.md`),
+  and the `c1_vio`, `c3_matcher`, `c3_5_adhop` component
+  `description.md` files.
+- No `_docs/02_document/architecture_compliance_baseline.md` exists;
+  the **Baseline Delta** section is omitted (consistent with all
+  prior cumulative reviews).
+
+### Phase 2 — Spec Compliance
+
+Per-batch reviews covered AC-by-AC compliance:
+
+- **B55 (AZ-528)**: 7/7 ACs + AST guard + 2 regression guards;
+  full c1_vio test surface re-verified unchanged.
+- **B56 (AZ-335)**: 10/10 ACs + 3 NFRs; warm-start save/load atomic
+  contracts exercised on real `tmp_path` via `Sha256Sidecar`.
+- **B57 (AZ-345/346/347/349)**: 47/47 ACs (12 + 14 + 13 + 11) via
+  the parametrised C3 test file (43 tests) + the AdHoP test file (23
+  tests). AC-special-2 (AZ-347) is informational per spec.
+
+Cross-batch checks:
+
+- AZ-335's `WarmStartHintStore` Protocol + `JsonSidecarWarmStartHintStore`
+  implementation are consumed only by `runtime_root/warm_start_wiring.py`
+  — no downstream consumer touched in B57.
+- AZ-528's `_facade_spine.py` is consumed by all three c1_vio
+  strategies; AZ-335's `WarmStartWiredStrategy` wraps any concrete
+  `VioStrategy` without needing knowledge of the spine. No coupling
+  drift.
+- AZ-345/346/347 all consume `LightGlueRuntime` + `RansacFilter` +
+  `RollingHealthWindow` by **identity** through the AZ-344 factory.
+  Tests assert identity-sharing for AZ-345 / AZ-346; AZ-347 omits
+  by design (AC-special-1).
+- AZ-349 (AdHoP) shares the `RansacFilter` instance with C3 + C4
+  per the contract; the B57 per-batch test (`test_ac10_factory_wires_adhop_strategy`)
+  asserts `instance._ransac_filter is ransac`.
+
+### Phase 3 — Code Quality
+
+- SOLID: `_pipeline.run_lightglue_pipeline` cleanly separates the
+  per-frame orchestration (one responsibility — drive the
+  drop-and-continue loop) from the per-backbone `extract_features`
+  closure. `AdHoPRefiner.refine_if_needed` is large (~80 lines) but
+  the conditional gate, invoked-path, and fall-through all sit on
+  the same level of abstraction — splitting would obscure
+  Invariant 4 rather than clarify it.
+- Error handling: `RefinerBackboneError` is the documented
+  conversion target for `RuntimeError` / `ValueError` from
+  `inference_runtime.infer`; `MemoryError` re-raises (AC-5
+  closed-set). Test coverage explicit on both paths.
+- No bare `except` / `pass`. No new `# noqa` markers introduced.
+
+### Phase 4 — Security Quick-Scan
+
+- No SQL, no shell-exec, no eval/exec. No hardcoded credentials.
+  C3 / C3.5 receive their model weights via `Path` (validated in
+  `__post_init__`); the path traversal vector is the same as every
+  other component's `Path`-based config field — covered by config
+  validation, not C3-specific.
+
+### Phase 5 — Performance Scan
+
+Per-batch reviews already covered the latency budgets:
+
+- B55: AZ-528 perf-neutral by design (verified by parity tests).
+- B56: warm-start save NFR p99 ≤ 50 ms on Tier-2 NVMe; smoke
+  exercises on `tmp_path` budget the same threshold.
+- B57: C3 strategies & AdHoP exercised with test doubles only —
+  real-hardware p95 budgets deferred to C3-PT-01 / C3.5-PT-01.
+
+### Phase 6 — Cross-Task Consistency
+
+**FDR record-kind namespace.** B57 registered 5 new kinds:
+
+- `matcher.frame_done`, `matcher.backbone_error`,
+  `matcher.insufficient_inliers`, `matcher.all_failed`
+- `refiner.frame_done`
+
+All five follow the established `<component>.<event>` convention
+already in use by `vio.warm_start`, `c2_5_rerank.frame_done`, etc.
+Round-trip fixtures added to
+`tests/unit/test_az272_fdr_record_schema.py` so future schema
+extensions cannot silently break the new kinds.
+
+**Strategy-label namespace.** B57 registered three new labels in
+`config.components['c3_matcher'].strategy`: `"disk_lightglue"`,
+`"aliked_lightglue"`, `"xfeat"`. These match the
+`matcher_label` field in `MatchResult` (AZ-348 AC-2) one-for-one;
+no drift between config label and per-frame telemetry label.
+
+**Refinement-label namespace.** AZ-349 emits `refinement_label="adhop"`
+on success; `"passthrough"` on gate-decided OR fall-through. The
+`KNOWN_STRATEGIES` enum in `c3_5_adhop/config.py` matches.
+
+### Phase 7 — Architecture Compliance
+
+**Layer direction.** All B55-57 source imports traced:
+
+- `c1_vio` → `_types`, `helpers`, `fdr_client`, `clock`, `config`
+  (L1) + sibling `c1_vio/*` (L3 internal). No cross-component
+  source imports.
+- `c3_matcher` → `_types`, `helpers`, `fdr_client`, `clock`, `config`
+  (L1) + sibling `c3_matcher/*` (L3 internal) + `c3_matcher/inference_runtime_cut`
+  (consumer-side cut over C7 per AZ-507). No direct
+  `c7_inference.*` source imports.
+- `c3_5_adhop` → same shape as `c3_matcher`; consumer-side cut
+  `c3_5_adhop/inference_runtime_cut`.
+- `runtime_root` → composition root; allowed to import any
+  component's internals. Imports `c1_vio/_facade_spine`,
+  `c1_vio/warm_start_store`, `c3_matcher/{disk_lightglue, aliked_lightglue, xfeat}`,
+  `c3_5_adhop/adhop_refiner`, `c3_5_adhop/passthrough_refiner`.
+
+No layer-direction violations.
+
+**Public API respect.** All cross-component imports go through the
+Public API surface. The `runtime_root` → component-internals
+imports are allowed by `module-layout.md` § 6 + § 9.
+
+**No new cyclic dependencies.** Module-level import graph diffed
+against batches 52-54: no new cycles introduced. The C7
+`InferenceRuntime` is consumed in C3 + C3.5 through their own
+consumer-side `InferenceRuntimeCut` Protocols, so the C7 ↔ C3 and
+C7 ↔ C3.5 cuts remain unidirectional (consumer imports the cut, not
+the producer).
+
+**Duplicate symbols across components.** Scanned the changed files
+for duplicate class / function / constant names. F1 (above) is the
+only parallel pattern across components and is correctly recorded
+as "two helpers with genuinely different signatures, not a
+duplicate".
+
+**Cross-cutting concerns.** `iso_ts_from_clock` is the shared
+ISO-8601 timestamp helper (`helpers/iso_timestamps.py`, AZ-526).
+B55-57 source uses it correctly (AZ-528's `_facade_spine.py`
+re-exports `now_iso` which itself calls `iso_ts_from_clock`).
+`Sha256Sidecar` (AZ-280) is the shared atomic-write helper —
+warm-start uses it. `RansacFilter` (AZ-282) and `LightGlueRuntime`
+(AZ-285) are identity-shared across C3 + C3.5 + C4. No
+cross-cutting concern locally re-implemented.
+
+## Verdict
+
+**PASS_WITH_WARNINGS** — three Low findings, all carried-forward or
+non-blocking style observations. No Critical / High / Architecture
+findings. The cumulative-review-52-54 Medium F1 finding (c1_vio
+facade-spine duplication) is **closed by AZ-528**, which lands
+cleanly with regression guards.
+
+Auto-fix would not improve any of the Low findings without
+expanding scope; the C3-PT-01 / C3.5-PT-01 / AZ-321 ALIKED-engine
+land will surface F1 (dtype) and F3 (latency) under real conditions.
+F2 is style-only.
+
+Proceed to next batch.
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md
index 66b24f6..8020b07 100644
--- a/_docs/_autodev_state.md
+++ b/_docs/_autodev_state.md
@@ -12,6 +12,6 @@ sub_step:
 retry_count: 0
 cycle: 1
 tracker: jira
-last_completed_batch: 56
-last_cumulative_review: batches_52-54
-current_batch: 57
+last_completed_batch: 57
+last_cumulative_review: batches_55-57
+current_batch: 58