mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 09:11:13 +00:00
[AZ-381] Fix ISam2GraphHandleImpl missing get_pose_key + comments
F1 (High/Architecture) from cumulative review of batches 01-22: `ISam2GraphHandleImpl` did not satisfy C4's `ISam2GraphHandle` Protocol stub (AZ-355) because it lacked `get_pose_key`. `pose_factory`'s isinstance gate would have raised at composition. Two Protocols (C4 minimal consumer cut, C5 richer producer surface) are intentional per AZ-355 Risk 1 — the impl just needed to expose the canonical name. Delegates to estimator.key_for_frame. Added cross-component conformance test asserting the C5 impl satisfies both Protocols, so future drift trips a unit test. F2 (Medium/Maintainability): added justifying comments at four `except: pass` sites in runtime_root, c8_fc_adapter (ap + inav), and c13_fdr writer. No behavioral change. Updated cumulative review report verdict from FAIL to PASS and recorded a post-mortem on the initial misframing (treated the dual-Protocol design as duplication on first read). Autodev state: batch 22 done, cumulative-review PASS, ready for batch 23. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,157 @@
|
||||
# Cumulative Code Review — Batches 01-22 (Cycle 1)
|
||||
|
||||
**Batches**: 01 through 22 (catch-up review after 7 missed cumulative-review triggers)
|
||||
**Date**: 2026-05-12
|
||||
**Cycle**: 1
|
||||
**Scope**: union of files changed since project start (no prior cumulative review)
|
||||
**Mode**: cumulative (per `code-review/SKILL.md` § Invocation modes)
|
||||
**Architecture baseline**: `_docs/02_document/architecture_compliance_baseline.md` does not exist — no Baseline Delta section emitted.
|
||||
|
||||
**Initial verdict**: **FAIL** (one HIGH-severity Architecture finding affecting cross-task wiring of C4 ↔ C5).
|
||||
**Post-remediation verdict**: **PASS** — F1 and F2 fixed in the same session; F3 is informational.
|
||||
|
||||
## Process Context
|
||||
|
||||
`implement/SKILL.md` Step 14.5 mandates a cumulative cross-batch review every K=3 completed batches. Across batches 01-22, no `cumulative_review_*` artifact exists; this report consolidates 22 batches' worth of accumulated structural drift in one pass. The 7 missed triggers (batches 3, 6, 9, 12, 15, 18, 21) were each individually low-value relative to this single end-to-end pass; the meta-issue is procedural and is captured as F3 below.
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title | Status |
|
||||
|---|----------|-------------|----------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|--------|
|
||||
| 1 | High | Architecture | `src/gps_denied_onboard/components/c5_state/_isam2_handle.py:ISam2GraphHandleImpl` | `ISam2GraphHandleImpl` missing `get_pose_key`, breaks the C4 consumer-side `isinstance` gate | FIXED |
|
||||
| 2 | Medium | Maintainability | 4 sites (see table) | 4 `except: pass` sites in production code without justifying comments | FIXED |
|
||||
| 3 | Low | Process | `_docs/03_implementation/` | 7 cumulative-review triggers (batches 3, 6, 9, 12, 15, 18, 21) were skipped before this run | NOTED |
|
||||
|
||||
### F1: `ISam2GraphHandleImpl` is missing `get_pose_key`, breaking the C4 consumer-side `isinstance` gate (High / Architecture)
|
||||
|
||||
**Design intent (NOT the bug)**
|
||||
|
||||
Two `ISam2GraphHandle` Protocols live in the codebase **intentionally**, and the design is correct:
|
||||
|
||||
| File | Owner | Role | Methods |
|
||||
|-------------------------------------------------------|---------|-------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|
|
||||
| `components/c4_pose/_isam2_handle.py` | C4 | Consumer-side minimal cut — declares only what C4 needs to attach pose factors | `get_pose_key(frame_id: int) -> int` |
|
||||
| `components/c5_state/_isam2_handle.py` | C5 | Producer-side richer surface — declares what C5 (and only C5) calls on its own iSAM2 graph internals | `add_factor`, `update`, `compute_marginals`, `last_anchor_age_ms` |
|
||||
|
||||
Per AZ-355 Risk-1 mitigation: "the stub defines ONLY `get_pose_key(frame_id) -> int` — the minimal surface C4 needs to attach factors. C5 (AZ-260/AZ-381) implements the concrete handle; if C5's graph design grows, the stub may grow but the Protocol surface stays stable as long as C4's needs don't change." Two consumer-side Protocols over a single concrete impl is the intended seam — neither component imports the other's internals.
|
||||
|
||||
**The bug**
|
||||
|
||||
The c4 docstring claims `ISam2GraphHandleImpl` "is strictly a superset" of the C4 surface. That claim was **provably false** before this fix: `ISam2GraphHandleImpl` did not implement `get_pose_key`, so `isinstance(c5_handle, c4.ISam2GraphHandle)` returned **False**.
|
||||
|
||||
Composition impact:
|
||||
|
||||
- `runtime_root/state_factory.py:29` returns `tuple[StateEstimator, ISam2GraphHandle]` where the handle is a `ISam2GraphHandleImpl` instance.
|
||||
- `runtime_root/pose_factory.py:118` runtime-checks `isinstance(isam2_graph_handle, c4.ISam2GraphHandle)` and raises `PoseEstimatorConfigError("...does not satisfy the C4 ISam2GraphHandle Protocol (missing get_pose_key?)")` on failure.
|
||||
- `compose_root(config)` (forthcoming) would have wired `state, handle = build_state_estimator(...); pose = build_pose_estimator(..., isam2_graph_handle=handle)` and tripped that gate.
|
||||
|
||||
**Specification deviation**
|
||||
|
||||
The C5 impl was supposed to expose `get_pose_key` as part of its public consumer surface (AZ-355 Risk 1 + the explicit "strictly a superset" claim in the c4 docstring). The C5 estimator already has the canonical name for the same lookup: `GtsamIsam2StateEstimator.key_for_frame(frame_id)` at `c5_state/gtsam_isam2_estimator.py:406`. The handle simply forgot to surface it.
|
||||
|
||||
**Why nothing has tripped yet**
|
||||
|
||||
- No `compose_root(config)` end-to-end wiring exists yet — `runtime_root/__init__.py` has the takeoff-abort and FDR-open paths but does not yet call both factories in one pass.
|
||||
- AZ-358 (C4 OpenCV/GTSAM concrete impl) is in `todo/` — its tests would have been the first to hit the gate.
|
||||
- The two test files (`test_az355_pose_protocol.py`, `test_az381_state_protocol.py`) each import `ISam2GraphHandle` from their *own* component, so each test passes in isolation; no test exercised cross-component conformance until this remediation added one.
|
||||
|
||||
**Applied remediation**
|
||||
|
||||
1. Added `ISam2GraphHandleImpl.get_pose_key(frame_id: int) -> int` (3 lines + docstring) at `components/c5_state/_isam2_handle.py`, delegating to `self._estimator.key_for_frame(frame_id)`. This is the C5-side canonical name for the same `'x'`-namespace lookup; the parameter `int` is a subset of `key_for_frame`'s declared `UUID | int`.
|
||||
2. Added cross-component conformance test `test_handle_satisfies_c4_isam2_graph_handle_protocol` in `tests/unit/c5_state/test_az381_state_protocol.py` that imports the C4 Protocol and asserts both `isinstance(handle, C4ISam2GraphHandle)` and the delegation call to `estimator.key_for_frame`.
|
||||
3. Updated `ISam2GraphHandleImpl` docstring to record the dual-Protocol satisfaction contract.
|
||||
|
||||
The two Protocols **stay as they are**. The c4 stub remains minimal (C4 imports nothing from C5). The c5 Protocol remains the C5-internal surface. The single concrete impl now satisfies both.
|
||||
|
||||
**Task reference**: AZ-355 (C4 stub), AZ-381 (C5 Protocol + impl), pose_factory (AZ-355 Outcome), state_factory (AZ-381 Outcome).
|
||||
|
||||
### F2: 4 `except: pass` sites in production code without justifying comments (Medium / Maintainability)
|
||||
|
||||
**Locations**:
|
||||
|
||||
| File | Line | Context |
|
||||
|----------------------------------------------------------------------------------------|------|-------------------------------------------------------------------------------------------------------------------------|
|
||||
| `src/gps_denied_onboard/runtime_root/__init__.py` | 545 | Inside `composition_root.takeoff_abort_stop_failed` log path — last-ditch swallow if the logger itself raises |
|
||||
| `src/gps_denied_onboard/components/c8_fc_adapter/pymavlink_ardupilot_adapter.py` | 587 | Best-effort STATUSTEXT emission on source-set switch failure (structured log already captured upstream) |
|
||||
| `src/gps_denied_onboard/components/c8_fc_adapter/msp2_inav_adapter.py` | 143 | `close()` on an already-broken MSP/MAV connection |
|
||||
| `src/gps_denied_onboard/components/c13_fdr/writer.py` | 541 | Cleanup of an empty segment file after a write failure |
|
||||
|
||||
**Description**
|
||||
|
||||
`coderule.mdc` requires: "Never suppress errors silently — no `2>/dev/null`, empty `catch` blocks, bare `except: pass`, or discarded error returns. … If an error is truly safe to ignore, log it or comment why." All four sites were defensible (defensive cleanup or best-effort secondary emission, never primary business logic) but none carried a justifying inline comment.
|
||||
|
||||
**Applied remediation**
|
||||
|
||||
Added a justifying multi-line comment at each of the four sites explaining (a) what the suppressed call is doing, (b) where the primary failure path surfaces, and (c) why losing the secondary signal is safe.
|
||||
|
||||
### F3: 7 cumulative-review triggers were skipped before this run (Low / Process)
|
||||
|
||||
**Location**: `_docs/03_implementation/`
|
||||
|
||||
**Description**
|
||||
|
||||
`implement/SKILL.md` § 14.5 says: "every K completed batches (default K = 3) … invoke `.cursor/skills/code-review/SKILL.md` in cumulative mode. … Output: write the report to `_docs/03_implementation/cumulative_review_batches_[NN-MM]_cycle[N]_report.md`." Across batches 01-22, no `cumulative_review_*` file existed prior to this report. Cumulative review is the mechanism for catching architecture drift, cross-task inconsistency, and contract drift between batches — exactly the class of issue F1 was.
|
||||
|
||||
F1 demonstrates the cost: a single missing method that has lived in the codebase since AZ-381 landed would have been caught on the first cumulative review and resolved with a 1-point fix. By the time it was caught here, it required the same 1-point fix plus a new conformance test plus this writeup.
|
||||
|
||||
**Remediation**
|
||||
|
||||
Going forward, the implement skill's step 14.5 trigger fires at batch 25 (next K=3 boundary after this report). The previously skipped triggers are not replayed — this report consolidates them.
|
||||
|
||||
## Section-by-Phase Notes (cumulative scope)
|
||||
|
||||
### Phase 1 — Context loading
|
||||
Read `_docs/02_document/architecture.md` (ADR-001, ADR-002, ADR-009 noted), `_docs/02_document/module-layout.md` (14 components, 5-layer table), the 44 completed task specs in `_docs/02_tasks/done/`, and the 22 per-batch reports in `_docs/03_implementation/`.
|
||||
|
||||
### Phase 2 — Spec compliance
|
||||
No new findings beyond F1. All 44 completed tasks have batch reports with PASS or PASS_WITH_WARNINGS verdicts; per-batch review verdicts are recorded under `_docs/03_implementation/reviews/`.
|
||||
|
||||
### Phase 3 — Code quality
|
||||
F2 captures the only structural smell. No SOLID violation, no functions over 50 lines without explanation, no obvious DRY violations across the 22 batches. The recent batches (20, 21, 22) introduced helpers (`_validate_takeoff_origin_args`) that the agents themselves flagged as duplication candidates for a future refactor — see batch_22_cycle1_report.md § Self-review findings. Not promoted to a finding here because the agents already surfaced and accepted them.
|
||||
|
||||
### Phase 4 — Security quick-scan
|
||||
No findings. No SQL string interpolation (psycopg parameterized queries throughout c6/c10 work). No `shell=True` in subprocess calls. No hardcoded secrets in source (`.env` patterns absent). MAVLink signing key handling in AZ-395 (batch 17) uses zeroisation per the task spec.
|
||||
|
||||
### Phase 5 — Performance scan
|
||||
No findings. The two performance-sensitive hot paths (`ISam2GraphHandleImpl.add_factor`, `compute_marginals`) carry NFR targets in their task specs and the AZ-382/383/384 batch reports record measured timings within target.
|
||||
|
||||
### Phase 6 — Cross-task consistency
|
||||
F1 is the dominant Phase-6 finding. Other cross-task checks:
|
||||
|
||||
- DTO surface drift across `_types/`: `FlightStateSignal` and `GpsHealth` moved from `_types/nav.py` to `_types/fc.py` as part of AZ-390 (batch 16) — `_types/nav.py:73-77` carries a comment explaining the migration. No stale imports of the old location detected.
|
||||
- `VioOutput` lives in `_types/vio.py` and is referenced by `c1_vio/__init__.py`, `c5_state/*` paths. AZ-331 task spec (todo) says it MUST live in `_types/nav.py` — that constraint is a forward-looking AZ-331 obligation; today's placement is internal to `_types/` and consumers import via `_types.vio` cleanly.
|
||||
- `WgsConverter.horizontal_distance_m` was added by AZ-490 (batch 22) and reuses `latlonalt_to_local_enu`. No duplicate distance helper exists elsewhere.
|
||||
|
||||
### Phase 7 — Architecture compliance
|
||||
1. **Layer direction**: every cross-component import in `src/gps_denied_onboard/` was scanned. All cross-component imports stay within the same component, go through a component's Public API (`__init__.py` re-exports or `interface.py`), or originate from `runtime_root/` (which is allowed to import concrete internals per § 6 of module-layout). No L1 → L2/L3/L4 imports; no L2 → L3/L4 imports.
|
||||
2. **Public API respect**: pose_factory imports `c4_pose._isam2_handle:ISam2GraphHandle` (within-component, fine); state_factory imports `c5_state._isam2_handle:ISam2GraphHandle` (within-component, fine). Neither imports the other's Protocol — the dual-Protocol design preserves component independence. F1 was a conformance gap on the impl side, not a Public-API breach.
|
||||
3. **No new cyclic dependencies**: import graph traced; no cycles introduced across batches 01-22.
|
||||
4. **Same-named Protocols across components**: `ISam2GraphHandle` appears in both `c4_pose` and `c5_state` — this is **intentional dual-Protocol design** per AZ-355 Risk 1 (not duplication). The one concrete impl (`ISam2GraphHandleImpl`) MUST satisfy both Protocols; F1's remediation enforces that contract with a cross-component conformance test.
|
||||
5. **Cross-cutting concerns reimplemented locally**: 14 component files use `time.monotonic_ns` / `time.time_ns` / `time.sleep` directly. This will become a violation when AZ-398 (in `todo/`) lands and declares `Clock` Invariant 2. Today it is **not** a finding because the Clock contract does not yet exist; recorded here so the AZ-398 batch can plan the retrofit scope (~14 files).
|
||||
|
||||
## Files Touched (batches 01-22)
|
||||
|
||||
Computed from `git log --name-only` against the 22 batch commits. Trimmed for brevity; the union covers `src/gps_denied_onboard/` (118 `.py` files) and `tests/unit/` (83 `.py` files) approximately in full, plus selected `_docs/02_document/` updates per AZ-385, AZ-489, AZ-490.
|
||||
|
||||
## Verdict & Gate
|
||||
|
||||
**Post-remediation: PASS.**
|
||||
|
||||
F1 + F2 fixed in this session (see "Applied remediation" sections above). F3 is informational and is addressed by going forward (next trigger at batch 25).
|
||||
|
||||
Files changed in remediation:
|
||||
|
||||
- `src/gps_denied_onboard/components/c5_state/_isam2_handle.py` — added `ISam2GraphHandleImpl.get_pose_key` + updated class docstring
|
||||
- `tests/unit/c5_state/test_az381_state_protocol.py` — added cross-component conformance test
|
||||
- `src/gps_denied_onboard/runtime_root/__init__.py` — comment at line 545
|
||||
- `src/gps_denied_onboard/components/c8_fc_adapter/pymavlink_ardupilot_adapter.py` — comment at line 587
|
||||
- `src/gps_denied_onboard/components/c8_fc_adapter/msp2_inav_adapter.py` — comment at line 143
|
||||
- `src/gps_denied_onboard/components/c13_fdr/writer.py` — comment at line 541
|
||||
|
||||
## Post-mortem — Review framing error (recorded for `meta-rule.mdc` Long Investigation Retrospective)
|
||||
|
||||
The first draft of this report mis-categorised F1 as "duplicate Protocols / Protocol drift" and proposed unifying the two Protocols. That framing was wrong: the dual-Protocol design is intentional (and explicitly documented in the c4 stub's docstring + AZ-355 Risk 1). The bug was narrower — the C5 impl was missing one method needed to satisfy the C4 stub.
|
||||
|
||||
Bottleneck: I treated two same-named Protocols in different components as accidental duplication without first reading each Protocol's docstring rationale. The c4 stub explicitly states "the C5 concrete handle is strictly a superset of this surface" — the docstring rationale would have steered the diagnosis correctly on first read.
|
||||
|
||||
Lesson: when scanning for duplicate symbols across components in Phase 7 #4, read each symbol's docstring rationale before classifying as "duplication". If the docstring explicitly justifies a consumer-side/producer-side split, the finding is NOT duplication — it's a possible conformance gap on the IMPL side (different category, different fix, different size).
|
||||
Reference in New Issue
Block a user