Files
gps-denied-onboard/_docs/03_implementation/reviews/batch_58_review.md
T
Oleksandr Bezdieniezhnykh 4eac24f37a [AZ-358] [AZ-361] C4 OpenCVGtsamPoseEstimator + Jacobian thermal hybrid
Implement the single production-default C4 PoseEstimator strategy.

AZ-358 — Marginals path: OpenCV solvePnPRansac (SOLVEPNP_IPPE) on
best-candidate inliers, PriorFactorPose3 with Jacobian-derived initial
covariance, flushed into C5's iSAM2 graph via the widened
ISam2GraphHandle.update(graph, values, None) (Option B). Posterior
covariance from compute_marginals().marginalCovariance(pose_key) with
SPD-defensive Cholesky check. Tile pixel -> ENU world conversion via
the shared WgsConverter + a configurable tile_size_px. Two spec
deviations now documented in the AZ-358 task file: PriorFactorPose3
over GenericProjectionFactorCal3DS2 (avoids unbounded landmark
variables; same Fisher information on the pose marginal) and explicit
(graph, values, timestamps) update args (aligns with C5's impl).

AZ-361 — Jacobian + thermal hybrid: per-frame dispatch on
thermal_state.thermal_throttle_active selects the cv2.projectPoints-
derived 6x6 information matrix (with ridge regularisation) as the
emitted covariance. Skips the iSAM2 factor add under throttle
(Invariant 12). Emits CovarianceDegradedWarning via warnings.warn
(never raised); paired WARN log + FDR record rate-limited per
covariance_degraded_warn_window_ns (default 60 s) via an injected
monotonic Clock. Supersedes the AZ-358 NotImplementedError stub.

Widens ISam2GraphHandle from get_pose_key only to all five C4-facing
methods (add_factor, update, compute_marginals, last_anchor_age_ms);
C5's existing ISam2GraphHandleImpl already satisfies the superset, so
no C5 source change this batch. Threads fdr_client + clock through
pose_factory composition.

Registers two new FDR payload kinds: pose.frame_done (per-call
telemetry; both success and PnpFailureError paths) and
pose.covariance_degraded (per-window throttle exposure).

Tests: 21 new (AZ-358 AC-1..11 + AZ-361 AC-1..10/12/13; AZ-361 AC-11
RMSE-ratio informational per spec, not asserted). Updates 2 existing
test files for Protocol widening and the FDR-schema round trip.

Code review verdict: PASS_WITH_WARNINGS (5 findings: Medium x2,
Low x3; none blocking). Full suite: 1958 passed, 1 unrelated
host-dependent perf failure (c12 CLI cold-start, pre-existing).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 05:01:14 +03:00

12 KiB
Raw Blame History

Code Review Report

Batch: 58 — AZ-358 (C4 OpenCVGtsamPoseEstimator — Marginals) + AZ-361 (C4 Jacobian + thermal hybrid) Date: 2026-05-14 Verdict: PASS_WITH_WARNINGS

Scope

Changed files reviewed:

  • src/gps_denied_onboard/components/c4_pose/opencv_gtsam_estimator.py (new, ~970 LOC)
  • src/gps_denied_onboard/components/c4_pose/_isam2_handle.py (Protocol widened 1 → 5 methods)
  • src/gps_denied_onboard/components/c4_pose/config.py (3 new fields + validation)
  • src/gps_denied_onboard/fdr_client/records.py (2 new payload kinds)
  • src/gps_denied_onboard/runtime_root/pose_factory.py (threads fdr_client + clock)
  • tests/unit/c4_pose/test_az358_361_opencv_gtsam_estimator.py (new, 21 tests)
  • tests/unit/c4_pose/test_az355_pose_protocol.py (Protocol fixture widened; lazy-import test corrected)
  • tests/unit/test_az272_fdr_record_schema.py (payload fixture extended)
  • _docs/02_tasks/todo/AZ-358_c4_opencv_gtsam_marginals.md (Option B + PriorFactorPose3 deviation documented)

Spec Compliance

AZ-358 (Marginals path)

AC Verified by Status
AC-1 PnP success on synthetic correspondences test_az358_ac1_pnp_success_synthetic_within_tolerance PASS
AC-2 Degenerate inliers → PnpFailureError test_az358_ac2_insufficient_inliers_raises_pnp_failure PASS
AC-3 SPD covariance test_az358_ac3_4_5_marginals_success_invariants PASS
AC-4 covariance_mode == MARGINALS on success same PASS
AC-5 source_label == SATELLITE_ANCHORED same PASS
AC-6 WGS84 conversion uses shared WgsConverter test_az358_ac6_wgs_converter_injection PASS
AC-7 iSAM2 handle call sequence (no add_factor; 1× each of get_pose_key, update, compute_marginals) test_az358_ac7_isam2_handle_call_sequence + test_az358_update_carries_prior_factor_in_local_graph PASS
AC-8 (replaced by AZ-361 — throttle now runs Jacobian path, not NotImplementedError) test_az358_ac8_replacement_throttle_now_runs_jacobian_path PASS — deviation documented in the AZ-358 spec
AC-9 Non-SPD covariance defensive raise test_az358_ac9_non_spd_covariance_raises_pnp_failure PASS
AC-10 Composition-root wiring (INFO log + identity-shared deps) test_az358_ac10_composition_root_wiring_emits_ready_log + test_az358_ac10_create_module_factory_returns_protocol_instance PASS
AC-11 FDR pose.frame_done record shape test_az358_ac11_fdr_pose_frame_done_shape_on_success PASS

Documented deviations from the original spec wording (now mirrored in the updated spec):

  1. PriorFactorPose3 instead of GenericProjectionFactorCal3DS2 — mathematically equivalent on the pose marginal, avoids unbounded landmark variables, no new georef infra needed.
  2. handle.update(graph, values, None) instead of no-arg update() — Option B (C4 builds the per-call diff itself; aligns with C5's ISam2GraphHandleImpl).
  3. AC-8 supplanted by AZ-361 — no NotImplementedError placeholder is shipped.

AZ-361 (Jacobian + thermal hybrid)

AC Verified by Status
AC-1 Per-frame mode dispatch test_az361_ac1_2_4_per_frame_mode_dispatch_alternates PASS
AC-2 Mode-switch latency ≤ 1 frame same PASS
AC-3 Jacobian covariance SPD test_az361_ac3_5_jacobian_success_invariants PASS
AC-4 covariance_mode == JACOBIAN same + AC-1 test PASS
AC-5 Source label SATELLITE_ANCHORED both paths AC-3 + AC-5 tests PASS
AC-6 CovarianceDegradedWarning via warnings.warn (not raised) test_az361_ac6_warning_emitted_via_warnings_warn PASS
AC-7 Warning rate-limited per 60 s window test_az361_ac7_8_warning_and_log_rate_limited PASS
AC-8 WARN log rate-limited similarly same PASS
AC-9 Marginals path unchanged by refactor full AZ-358 AC suite still passes PASS
AC-10 Near-singular Jacobian → defensive raise test_az361_ac10_near_singular_jacobian_raises_pnp_failure PASS
AC-11 Jacobian within ~510 % of Marginals (informational) not tested — see F5 PARTIAL (spec marks informational)
AC-12 Jacobian path skips iSAM2 factor add test_az361_ac12_jacobian_path_skips_isam2_update PASS
AC-13 FDR mode field distinguishes path test_az361_ac13_fdr_mode_field_distinguishes_path PASS

Contract verification

_docs/02_document/contracts/c4_pose/pose_estimator_protocol.md v1.0.0:

  • Public PoseEstimator.estimate signature matches the implementation's estimate(match_result, calibration, thermal_state) -> PoseEstimate.
  • Invariant 1 (single-threaded) — not enforced structurally (Python GIL + composition-root contract); documented.
  • Invariant 5 (SPD covariance) — runtime np.linalg.cholesky check on both paths; on failure raises PnpFailureError.
  • Invariant 6 (covariance_mode matches path actually taken) — set per-path before assemble.
  • Invariant 7 (source_label always SATELLITE_ANCHORED on success) — hard-coded in _assemble_pose_estimate.
  • Invariant 8 (last_satellite_anchor_age_ms sourced from C5) — comes via handle.last_anchor_age_ms(); C4 never computes it.
  • Invariant 9 (only PnpFailureError escapes) — verified by error-path tests.

The ISam2GraphHandle Protocol stub at c4_pose/_isam2_handle.py is widened to five methods in lockstep with C5's superset implementation (per ADR-003). C5 is documented to already provide the superset, so no C5-side source change is required this batch.

Findings

# Severity Category File:Line Title
1 Medium Maintainability opencv_gtsam_estimator.py:254-345,381-435 Duplicated FDR-error + ERROR-log + raise blocks across Marginals and Jacobian paths
2 Medium Performance opencv_gtsam_estimator.py:617-657 _tile_pixels_to_enu_world runs a Python for loop with per-point pyproj Transformer calls
3 Low Style opencv_gtsam_estimator.py:289-296 Bare except Exception: pass for GTSAM duplicate-key insert; could narrow to RuntimeError or pre-check via Values.exists(key)
4 Low Performance opencv_gtsam_estimator.py:535-540, 575-586 cv2.projectPoints called twice on the Marginals path (residuals + Jacobian); a single call returning both outputs would halve that cost
5 Low Spec-Gap tests/unit/c4_pose/test_az358_361_opencv_gtsam_estimator.py AZ-361 AC-11 (Jacobian RMSE within 1.10× Marginals on synthetic baseline) is not directly asserted; spec explicitly tags this AC informational

Finding details

F1: Duplicated FDR-error + ERROR-log + raise blocks (Medium / Maintainability)

  • Location: opencv_gtsam_estimator.py — three near-identical 10-line blocks inside _estimate_marginals_path (PnP-fail, marginals-fail, SPD-fail) and two more inside _estimate_jacobian_path (PnP-fail, SPD-fail). All five do: _emit_pose_frame_done_fdr_error(...)self._log.error(_LOG_KIND_PNP_FAILURE, extra={...})raise PnpFailureError(...).
  • Description: shared error-emit shape; only the stage key + the wrapping exception message change.
  • Suggestion: extract a private _fail(frame_id, mode, stage, exc, message) -> NoReturn helper. Keeps the path methods readable and removes 4 of 5 duplications.
  • Task: AZ-358 + AZ-361. Non-blocking — internal cleanup only.

F2: Per-point pyproj loop in _tile_pixels_to_enu_world (Medium / Performance)

  • Location: opencv_gtsam_estimator.py:617-657.
  • Description: for each inlier, the helper calls WgsConverter.latlonalt_to_local_enu(origin, LatLonAlt(...)), which in turn calls _ECEF_FROM_LLA.transform(...) once. On a 50-inlier frame this is 50 individual pyproj boundary crossings (~ms each). Pose-path p95 target is 90 ms for the whole Marginals path; this leaves little headroom.
  • Suggestion: collect (lon, lat, alt) arrays once, call _ECEF_FROM_LLA.transform(lons, lats, alts) in vectorized form, then apply the (single) ENU rotation in NumPy. Behavior-preserving.
  • Task: AZ-358. Non-blocking for AC compliance; meaningful for NFT-LIM-04 / C4-PT-01 latency margin.

F3: Bare except Exception on local_values.insert (Low / Style)

  • Location: opencv_gtsam_estimator.py:289-296.
  • Description: catches all exceptions to swallow GTSAM's duplicate-insert error. A real bug elsewhere (wrong type, wrong key) would be silently masked.
  • Suggestion: replace with if not local_values.exists(pose_key): local_values.insert(...). Removes the swallow entirely.
  • Task: AZ-358. Non-blocking.

F4: Two cv2.projectPoints calls per Marginals frame (Low / Performance)

  • Location: opencv_gtsam_estimator.py:_compute_reprojection_residuals + opencv_gtsam_estimator.py:_jacobian_covariance.
  • Description: cv2.projectPoints is called once to compute residuals (no Jacobian requested) and once more to compute the Jacobian (no residual reuse). On the Marginals path both fire per frame.
  • Suggestion: in _run_pnp, request Jacobian on the residual call and stash it on _PnpResult; _jacobian_covariance reuses the stashed Jacobian.
  • Task: AZ-358. Non-blocking.

F5: AZ-361 AC-11 (Jacobian-vs-Marginals RMSE) not asserted in tests (Low / Spec-Gap)

  • Location: tests/unit/c4_pose/test_az358_361_opencv_gtsam_estimator.py.
  • Description: the spec marks AC-11 informational and explicitly states it does NOT block. Still, no test exists today that runs both paths on a fixed synthetic and asserts the RMSE ratio.
  • Suggestion: add a single synthetic-baseline test that runs both paths on the same MatchResult and asserts rmse_jacobian / rmse_marginals <= 1.10. Defer to a follow-up batch — does not block this review.
  • Task: AZ-361.

Phase 4 — Security Quick-Scan

No findings. No SQL paths, subprocess usage, hardcoded credentials, network I/O, or unbounded deserialization in the changed surface. Log/FDR payloads carry only telemetry (frame ids, covariance norms, residuals, public coordinates) — no operator PII.

Phase 5 — Performance Scan

  • F2 and F4 above are the only meaningful items.
  • No O(n^2) patterns; no N+1 queries; no blocking I/O.
  • NumPy np.asarray(..., copy=False) and np.ascontiguousarray(...) used appropriately on hot paths.

Phase 6 — Cross-Task Consistency

AZ-358 and AZ-361 land in the same file and share _run_pnp, _jacobian_covariance, _assemble_pose_estimate, _emit_pose_frame_done_fdr, and _emit_pose_frame_done_fdr_error. Both paths emit the same pose.frame_done schema; only AZ-361 additionally emits pose.covariance_degraded. No conflicting patterns.

The ISam2GraphHandle Protocol widening lands in lockstep with both tasks and with the AZ-355 test fixture (_FakeISam2GraphHandle now implements all five methods). A new test (test_ac10_isam2_graph_handle_rejects_partial_surface) was added to the AZ-355 test file to keep the runtime-checkable Protocol from regressing.

Phase 7 — Architecture Compliance

Imports in opencv_gtsam_estimator.py:

  • _types/* (L1) — OK.
  • clock/wall_clock (L1) — OK.
  • components/c4_pose/* (same component, L3) — OK.
  • fdr_client/records (L2 cross-cutting) — OK.
  • helpers/wgs_converter (L2 helpers) — OK.
  • logging (L1) — OK.

No cross-component imports into C5/C6/C7 internals (the only C5 touch-point is the ISam2GraphHandle Protocol owned by c4_pose/_isam2_handle.py). No new module cycles introduced (verified by full-suite run). No duplicated symbols (Quat / LatLonAlt / PoseEstimate are imported, not redefined). Cross-cutting concerns (logging, FDR, WGS conversion, clock) are consumed via shared modules, not re-implemented.

runtime_root/pose_factory.py legitimately imports both the C4 public surface (via __init__) and the private _isam2_handle module — the latter is documented as composition-root-only, which pose_factory.py is.

No Architecture findings.

Verdict

PASS_WITH_WARNINGS — every AC that the spec marks blocking has at least one passing unit test. Five non-blocking findings (Medium ×2, Low ×3) are documented above for follow-up. No Critical or High findings.

Test outcomes

  • tests/unit/c4_pose/ + tests/unit/test_az272_fdr_record_schema.py: 101 passed.
  • Full repo suite: 1958 passed, 1 failed, 84 skipped. The single failure (tests/unit/c12_operator_orchestrator/test_cli_console_script.py::test_cold_start_under_500ms_p99) is a host-dependent CLI cold-start performance assertion (500 ms NFR vs. 700-1100 ms on this macOS dev box) — unrelated to the C4 / FDR surface and pre-existing flake on this hardware.