From 360aece7a6b782788963318a047f0cbf5a6b3cd6 Mon Sep 17 00:00:00 2001 From: Oleksandr Bezdieniezhnykh Date: Thu, 14 May 2026 04:12:47 +0300 Subject: [PATCH] [AZ-528] [AZ-335] [AZ-345..AZ-347] [AZ-349] Cumulative review 55-57 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cumulative code review for the C3 / C3.5 cross-domain matching pipeline going live (B55 facade-spine consolidation, B56 warm-start + F8 reboot recovery, B57 three concrete matchers + AdHoP refiner). Verdict PASS_WITH_WARNINGS — three Low findings, no Critical / High / Architecture issues. Cumulative-52-54 Medium F1 (c1_vio facade-spine duplication) closed by AZ-528 with regression guards. State: last_completed_batch=57, last_cumulative_review=batches_55-57, current_batch=58. Co-authored-by: Cursor --- ...tive_review_batches_55-57_cycle1_report.md | 265 ++++++++++++++++++ _docs/_autodev_state.md | 6 +- 2 files changed, 268 insertions(+), 3 deletions(-) create mode 100644 _docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md diff --git a/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md b/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md new file mode 100644 index 0000000..d7289f2 --- /dev/null +++ b/_docs/03_implementation/cumulative_review_batches_55-57_cycle1_report.md @@ -0,0 +1,265 @@ +# Cumulative Code Review — Batches 55-57 (Cycle 1) + +**Date**: 2026-05-14 +**Range**: batches 55 (AZ-528 — `c1_vio` facade-spine consolidation), 56 (AZ-335 — C1 warm-start + F8 reboot recovery), 57 (AZ-345/346/347/349 — C3 concrete matchers + C3.5 AdHoP refiner) +**Compared against**: previous cumulative review batches 52-54 +**Verdict**: **PASS_WITH_WARNINGS** + +## Scope + +40 files changed across batches 55-57 (`git diff --stat 21cef8b..HEAD`): + +- 21 src/ files (10 new, 11 modified) across `c1_vio`, `c3_matcher`, + `c3_5_adhop`, `runtime_root`, `fdr_client`. +- 5 test files (3 new, 2 modified). +- 6 doc files (3 batch reports + 3 review reports). +- 6 _docs / housekeeping (state file, dependencies table, task moves). + +The trio is **the C-3/C-3.5 cross-domain matching pipeline going live**: +- B55 closed the cumulative-review-49-51 follow-up (3-way duplication + in c1_vio strategy facades — AZ-528). +- B56 added warm-start hint persistence + F8 reboot recovery wiring, + which both producers (any concrete `VioStrategy`) and consumers + (operator orchestrator + composition root) now exercise. +- B57 made C3 (3 strategies) and C3.5 (production-default refiner) + runnable end-to-end against real engine handles; the AZ-344 + + AZ-348 factories now have real strategy targets for every + configured `config.matcher.strategy` and `config.refiner.strategy` + value. + +## Carry-over closure from cumulative review 52-54 + +| Prior finding | Status | Notes | +|---------------|--------|-------| +| F1 (Medium) — c1_vio strategy facade spine duplicated 3-way | **CLOSED by AZ-528 / Batch 55** | `_facade_spine.py` lands the shared helpers + `FacadeSpine` mixin; `okvis2.py`, `vins_mono.py`, `klt_ransac.py` now import from the spine. Regression guards added in `tests/unit/c1_vio/test_az528_facade_spine.py` (AST walk + import grep + behaviour parity). | +| F2 (Low) — c1_vio test fakes not yet shared | **OPEN — carry forward** | No code touched this in B55-57. Still no active drift; KLT/RANSAC and the OKVIS2+VINS-Mono pair use genuinely different abstractions (monkeypatch vs fake backend). Continues to be a 1-2 point future hygiene pass. | + +## Findings (this window) + +| # | Severity | Category | File:Line | Title | +|---|----------|-----------------|-----------|-------| +| F1 | Low | Maintainability | `c3_matcher/_engine_output_assertion.py:60` vs `c2_vpr/_engine_dim_assertion.py:86` | Two parallel engine-output-probe helpers with different probe dtypes (FP32 vs FP16) — pattern is consistent but the probe-dtype divergence creates a footgun for future keypoint-backbone work | +| F2 | Low | Style | `c3_matcher/xfeat.py:33-40` | `xfeat.py` imports four underscore-prefixed symbols from sibling `_pipeline.py` | +| F3 | Low | Scope | AZ-347 AC-special-2 | Latency comparison (XFeat < DISK+LightGlue p95) flagged in B57 per-batch review; carried forward to keep cross-batch traceability | + +## Finding Details + +### F1: Two parallel engine-output-probe helpers (Low / Maintainability) + +- **Locations**: + - `src/gps_denied_onboard/components/c2_vpr/_engine_dim_assertion.py:86` + (introduced by AZ-527 / batch 52) + - `src/gps_denied_onboard/components/c3_matcher/_engine_output_assertion.py:60` + (introduced by AZ-346 / batch 57) + +- **Description**: Both helpers are dry-run engine probes invoked at + strategy `create()` time. C2 VPR backbones emit a single + descriptor tensor at `output_key`; C3 keypoint backbones emit + `keypoints` + `descriptors`. The two contracts genuinely differ — + this is **not a duplicate-symbol violation** (the rule from + cumulative-49-51 / -52-54). What it IS is a parallel pattern with + inconsistent probe dtypes: C2's helper allocates `np.float16` + (matching the FP16 backbones); C3's helper allocates `np.float32`. + AZ-346's ALIKED build config uses `PrecisionMode.FP16` — if a TRT + 10.x engine binding rejects an FP32 tensor on input, the probe will + fail at startup with a confusing dtype error rather than the + intended schema check. + +- **Suggestion**: either (a) thread the engine's input dtype through + `assert_keypoint_engine_output_schema` and synthesise the probe in + the matching dtype, OR (b) hard-code the probe to `np.float16` and + document the FP16-backbones-only assumption. No fix required this + cycle — the existing tests cover the schema-mismatch path with a + fake runtime that accepts any dtype. Re-evaluate when AZ-321 lands + a real TRT-compiled ALIKED engine and the IT-level integration + surfaces the actual binding behaviour. Sized at <1 point. + +- **Why Low**: no current runtime exercises the FP16-binding path + end-to-end; the failure mode is "confusing startup error" not + "silent miscalculation". A future cumulative review may re-classify + if a TRT-level test surfaces the issue. + +### F2: XFeat imports underscore-prefixed helpers from `_pipeline.py` (Low / Style) + +- **Location**: `src/gps_denied_onboard/components/c3_matcher/xfeat.py:33-40` +- **Description**: Recorded in the B57 per-batch review (F1). Carried + here so the next cumulative review sees the carry-over instead of + re-discovering. Convention-only: the four helpers + (`_emit_backbone_error`, `_emit_frame_done`, `_emit_insufficient_inliers`, + `_fail_all`) are component-internal but cross-file-public. Either + (a) drop the leading underscore, OR (b) document the + module-internal-but-cross-file convention in `_pipeline.py`'s + docstring. Both are sub-1-point changes. + +### F3: AZ-347 AC-special-2 latency benchmark not tested (Low / Scope) + +- **Location**: `_docs/02_tasks/done/AZ-347_c3_xfeat.md:91-94` +- **Description**: AZ-347 explicitly marks AC-special-2 as + "informational metric; if XFeat is NOT faster, that's a backbone + misconfiguration, not a contract violation. Documented in the test + report; does NOT block this AC." Recorded in B57 per-batch review. + Carry-forward only for traceability; real benchmark belongs to + C3-PT-01 (E-BBT, deferred). + +## Phase Summary + +### Phase 1 — Context Loading + +- 6 task specs reviewed (AZ-528, AZ-335, AZ-345, AZ-346, AZ-347, + AZ-349) from `_docs/02_tasks/done/`. +- Doc inputs: `_docs/02_document/architecture.md`, + `_docs/02_document/module-layout.md`, the 3 per-batch reviews + (`batch_55_review.md` / `batch_56_review.md` / `batch_57_review.md`), + the prior cumulative review (`cumulative_review_batches_52-54_cycle1_report.md`), + and the `c1_vio`, `c3_matcher`, `c3_5_adhop` component + `description.md` files. +- No `_docs/02_document/architecture_compliance_baseline.md` exists; + the **Baseline Delta** section is omitted (consistent with all + prior cumulative reviews). + +### Phase 2 — Spec Compliance + +Per-batch reviews covered AC-by-AC compliance: + +- **B55 (AZ-528)**: 7/7 ACs + AST guard + 2 regression guards; + full c1_vio test surface re-verified unchanged. +- **B56 (AZ-335)**: 10/10 ACs + 3 NFRs; warm-start save/load atomic + contracts exercised on real `tmp_path` via `Sha256Sidecar`. +- **B57 (AZ-345/346/347/349)**: 47/47 ACs (12 + 14 + 13 + 11) via + the parametrised C3 test file (43 tests) + the AdHoP test file (23 + tests). AC-special-2 (AZ-347) is informational per spec. + +Cross-batch checks: + +- AZ-335's `WarmStartHintStore` Protocol + `JsonSidecarWarmStartHintStore` + implementation are consumed only by `runtime_root/warm_start_wiring.py` + — no downstream consumer touched in B57. +- AZ-528's `_facade_spine.py` is consumed by all three c1_vio + strategies; AZ-335's `WarmStartWiredStrategy` wraps any concrete + `VioStrategy` without needing knowledge of the spine. No coupling + drift. +- AZ-345/346/347 all consume `LightGlueRuntime` + `RansacFilter` + + `RollingHealthWindow` by **identity** through the AZ-344 factory. + Tests assert identity-sharing for AZ-345 / AZ-346; AZ-347 omits + by design (AC-special-1). +- AZ-349 (AdHoP) shares the `RansacFilter` instance with C3 + C4 + per the contract; the B57 per-batch test (`test_ac10_factory_wires_adhop_strategy`) + asserts `instance._ransac_filter is ransac`. + +### Phase 3 — Code Quality + +- SOLID: `_pipeline.run_lightglue_pipeline` cleanly separates the + per-frame orchestration (one responsibility — drive the + drop-and-continue loop) from the per-backbone `extract_features` + closure. `AdHoPRefiner.refine_if_needed` is large (~80 lines) but + the conditional gate, invoked-path, and fall-through all sit on + the same level of abstraction — splitting would obscure + Invariant 4 rather than clarify it. +- Error handling: `RefinerBackboneError` is the documented + conversion target for `RuntimeError` / `ValueError` from + `inference_runtime.infer`; `MemoryError` re-raises (AC-5 + closed-set). Test coverage explicit on both paths. +- No bare `except` / `pass`. No new `# noqa` markers introduced. + +### Phase 4 — Security Quick-Scan + +- No SQL, no shell-exec, no eval/exec. No hardcoded credentials. + C3 / C3.5 receive their model weights via `Path` (validated in + `__post_init__`); the path traversal vector is the same as every + other component's `Path`-based config field — covered by config + validation, not C3-specific. + +### Phase 5 — Performance Scan + +Per-batch reviews already covered the latency budgets: + +- B55: AZ-528 perf-neutral by design (verified by parity tests). +- B56: warm-start save NFR p99 ≤ 50 ms on Tier-2 NVMe; smoke + exercises on `tmp_path` budget the same threshold. +- B57: C3 strategies & AdHoP exercised with test doubles only — + real-hardware p95 budgets deferred to C3-PT-01 / C3.5-PT-01. + +### Phase 6 — Cross-Task Consistency + +**FDR record-kind namespace.** B57 registered 5 new kinds: + +- `matcher.frame_done`, `matcher.backbone_error`, + `matcher.insufficient_inliers`, `matcher.all_failed` +- `refiner.frame_done` + +All five follow the established `.` convention +already in use by `vio.warm_start`, `c2_5_rerank.frame_done`, etc. +Round-trip fixtures added to +`tests/unit/test_az272_fdr_record_schema.py` so future schema +extensions cannot silently break the new kinds. + +**Strategy-label namespace.** B57 registered three new labels in +`config.components['c3_matcher'].strategy`: `"disk_lightglue"`, +`"aliked_lightglue"`, `"xfeat"`. These match the +`matcher_label` field in `MatchResult` (AZ-348 AC-2) one-for-one; +no drift between config label and per-frame telemetry label. + +**Refinement-label namespace.** AZ-349 emits `refinement_label="adhop"` +on success; `"passthrough"` on gate-decided OR fall-through. The +`KNOWN_STRATEGIES` enum in `c3_5_adhop/config.py` matches. + +### Phase 7 — Architecture Compliance + +**Layer direction.** All B55-57 source imports traced: + +- `c1_vio` → `_types`, `helpers`, `fdr_client`, `clock`, `config` + (L1) + sibling `c1_vio/*` (L3 internal). No cross-component + source imports. +- `c3_matcher` → `_types`, `helpers`, `fdr_client`, `clock`, `config` + (L1) + sibling `c3_matcher/*` (L3 internal) + `c3_matcher/inference_runtime_cut` + (consumer-side cut over C7 per AZ-507). No direct + `c7_inference.*` source imports. +- `c3_5_adhop` → same shape as `c3_matcher`; consumer-side cut + `c3_5_adhop/inference_runtime_cut`. +- `runtime_root` → composition root; allowed to import any + component's internals. Imports `c1_vio/_facade_spine`, + `c1_vio/warm_start_store`, `c3_matcher/{disk_lightglue, aliked_lightglue, xfeat}`, + `c3_5_adhop/adhop_refiner`, `c3_5_adhop/passthrough_refiner`. + +No layer-direction violations. + +**Public API respect.** All cross-component imports go through the +Public API surface. The `runtime_root` → component-internals +imports are allowed by `module-layout.md` § 6 + § 9. + +**No new cyclic dependencies.** Module-level import graph diffed +against batches 52-54: no new cycles introduced. The C7 +`InferenceRuntime` is consumed in C3 + C3.5 through their own +consumer-side `InferenceRuntimeCut` Protocols, so the C7 ↔ C3 and +C7 ↔ C3.5 cuts remain unidirectional (consumer imports the cut, not +the producer). + +**Duplicate symbols across components.** Scanned the changed files +for duplicate class / function / constant names. F1 (above) is the +only parallel pattern across components and is correctly recorded +as "two helpers with genuinely different signatures, not a +duplicate". + +**Cross-cutting concerns.** `iso_ts_from_clock` is the shared +ISO-8601 timestamp helper (`helpers/iso_timestamps.py`, AZ-526). +B55-57 source uses it correctly (AZ-528's `_facade_spine.py` +re-exports `now_iso` which itself calls `iso_ts_from_clock`). +`Sha256Sidecar` (AZ-280) is the shared atomic-write helper — +warm-start uses it. `RansacFilter` (AZ-282) and `LightGlueRuntime` +(AZ-285) are identity-shared across C3 + C3.5 + C4. No +cross-cutting concern locally re-implemented. + +## Verdict + +**PASS_WITH_WARNINGS** — three Low findings, all carried-forward or +non-blocking style observations. No Critical / High / Architecture +findings. The cumulative-review-52-54 Medium F1 finding (c1_vio +facade-spine duplication) is **closed by AZ-528**, which lands +cleanly with regression guards. + +Auto-fix would not improve any of the Low findings without +expanding scope; the C3-PT-01 / C3.5-PT-01 / AZ-321 ALIKED-engine +land will surface F1 (dtype) and F3 (latency) under real conditions. +F2 is style-only. + +Proceed to next batch. diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 66b24f6..8020b07 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -12,6 +12,6 @@ sub_step: retry_count: 0 cycle: 1 tracker: jira -last_completed_batch: 56 -last_cumulative_review: batches_52-54 -current_batch: 57 +last_completed_batch: 57 +last_cumulative_review: batches_55-57 +current_batch: 58