From 3a61a4f5bff03c6639aba831715e8d21adfcdb01 Mon Sep 17 00:00:00 2001 From: Oleksandr Bezdieniezhnykh Date: Wed, 13 May 2026 06:40:09 +0300 Subject: [PATCH] chore: cumulative review batches 37-39 (PASS_WITH_WARNINGS) Captures the C11 operator-side trio landing (AZ-317/318/319) plus the C10 build-orchestrator close-out (AZ-325) and the AZ-515 canonical-hash extraction. Three Low findings, all documentation-level drift between spec text and as-built code; none block Batch 40. Resolves prior F1 (AZ-515 closed the verifier-into-builder private import). Co-authored-by: Cursor --- ...tive_review_batches_37-39_cycle1_report.md | 115 ++++++++++++++++++ _docs/_autodev_state.md | 2 +- 2 files changed, 116 insertions(+), 1 deletion(-) create mode 100644 _docs/03_implementation/cumulative_review_batches_37-39_cycle1_report.md diff --git a/_docs/03_implementation/cumulative_review_batches_37-39_cycle1_report.md b/_docs/03_implementation/cumulative_review_batches_37-39_cycle1_report.md new file mode 100644 index 0000000..ee7d3ce --- /dev/null +++ b/_docs/03_implementation/cumulative_review_batches_37-39_cycle1_report.md @@ -0,0 +1,115 @@ +# Cumulative Code Review — Batches 37–39 / Cycle 1 + +**Date**: 2026-05-13 +**Mode**: Cumulative (all 7 phases, emphasis on Phases 6 + 7) +**Batches covered**: 37, 38, 39 +**Tasks covered**: AZ-325 (C10 CacheProvisioner), AZ-515 (C10 canonical-hash extraction), AZ-317 (C11 FlightStateGate), AZ-318 (C11 PerFlightKeyManager), AZ-319 (C11 HttpTileUploader) +**Changed files in scope**: 12 production + 6 tests + 5 docs + 1 archive (see "Scope" below) + +| Domain | Files (changed since cumulative_review_batches_34-36_cycle1_report.md) | +|--------|------------------------------------------------------------------------| +| c10_provisioning (production) | `provisioner.py` (new, AZ-325), `_canonical_hash.py` (new, AZ-515 — extracted shared aggregate-hash module), `manifest_builder.py` (refactor — now imports `_canonical_hash`), `manifest_verifier.py` (refactor — now imports `_canonical_hash`), `__init__.py` (re-exports `CacheProvisionerImpl`) | +| c11_tile_manager (production) | `_types.py` (new — `FlightStateSignal`, `PublicKeyFingerprint`, `IngestStatus`, `UploadOutcome`, `UploadRequest`, `PerTileStatus`, `UploadBatchReport`), `errors.py` (new — `TileManagerError` parent + 5 subclasses), `interface.py` (extended — `FlightStateSource`, real `TileUploader` Protocol), `flight_state_gate.py` (new, AZ-317), `signing_key.py` (new, AZ-318), `config.py` (new, AZ-319 — `C11Config`), `tile_uploader.py` (new, AZ-319 — `HttpTileUploader` + 3 consumer-side cuts over c6), `__init__.py` (full upload-side surface re-exports + `register_component_block`) | +| runtime_root (composition root) | `c11_factory.py` (new — `build_flight_state_gate`, `build_per_flight_key_manager`, `build_tile_uploader`) | +| fdr_client (cross-cutting) | `records.py` (5 new `KNOWN_PAYLOAD_KEYS` entries: `c11.upload.session.key.public`, `c11.upload.signature_rejected`, `c11.upload.tile.queued`, `c11.upload.tile.rejected`, `c11.upload.batch.complete`) | +| Tests | `tests/unit/c10_provisioning/test_cache_provisioner.py` (new, AZ-325), `tests/unit/c11_tile_manager/test_flight_state_gate.py` (new, AZ-317 — 13 tests), `tests/unit/c11_tile_manager/test_signing_key.py` (new, AZ-318 — 13 tests), `tests/unit/c11_tile_manager/test_tile_uploader.py` (new, AZ-319 — 15 tests), `tests/unit/c11_tile_manager/test_protocol_conformance.py` (new, AZ-319 AC-12), `tests/unit/test_az272_fdr_record_schema.py` (5 fixture additions) | +| Docs | `_docs/02_tasks/done/AZ-325_*`, `done/AZ-515_*`, `done/AZ-317_*`, `done/AZ-318_*`, `done/AZ-319_*` (archived from `todo/`); per-batch reports + reviews under `_docs/03_implementation/` | + +**Verdict**: **PASS_WITH_WARNINGS** + +## Summary + +No Critical or High findings. Three findings total: all Low (one Maintainability / two Spec-Gap). The dominant achievement of this window is the **end-to-end landing of the C11 operator-side upload path** (gate → ephemeral signing → multipart POST → mark uploaded → FDR alert), the **closure of the C10 build orchestrator** (CacheProvisioner) for E-C12 consumption, and the **resolution of the prior cumulative review's F1** via AZ-515's `_canonical_hash.py` extraction. + +### Architecture-level outcomes + +1. **C11 operator-side trio is fully wired.** AZ-317 (gate), AZ-318 (per-flight Ed25519 ephemeral key), and AZ-319 (`HttpTileUploader`) compose at the operator-binary composition root through `build_flight_state_gate` / `build_per_flight_key_manager` / `build_tile_uploader`. The contract surface (`TileUploader` Protocol + `UploadRequest` / `UploadBatchReport` / `PerTileStatus` / `IngestStatus` / `UploadOutcome` DTOs) is canonical and re-exported through `c11_tile_manager.__init__`. Three D-PROJ-2 ingest contract concerns are handled in code: canonical signing-payload bytes (deterministic SHA-256 over a frozen field order), `Retry-After` parsing for both RFC 7231 forms, and per-tile signature-rejection routing to `PerFlightKeyManager.record_signature_rejection`. + +2. **Consumer-side Protocol cut pattern continues to scale.** AZ-319 introduces three more cuts (`_TilePixelHandleLike`, `_TileBytesReader`, `_PendingMetadataReader`) over c6, all local to `c11_tile_manager.tile_uploader`. The composition root binds the concrete c6 implementations through `build_tile_uploader`. AZ-270 lint (`test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies`) remains green; zero `components.X` cross-component imports inside `src/gps_denied_onboard/components/**/*.py`. Cumulative count of consumer-side cuts in production code: **10+** (4 in c10_provisioning from AZ-322/323/324, 3 in c11_tile_manager.tile_uploader from AZ-319, plus the FlightStateSource cut from AZ-317 and the engine cuts from AZ-321). + +3. **AZ-515 closed F1 from the prior cumulative review.** The previous window's F1 (verifier reaching into builder's private `_aggregate_tile_hash`) is resolved by AZ-515 — the canonical aggregate-hash helper now lives at `c10_provisioning/_canonical_hash.py` and is imported symmetrically by `manifest_builder.py` and `manifest_verifier.py`. The intra-component shared utility makes the trust-chain glue between AZ-323 and AZ-324 explicit. No public API impact. + +4. **FDR registry expanded by five kinds.** All five C11 upload-side records (`session.key.public`, `signature_rejected`, `tile.queued`, `tile.rejected`, `batch.complete`) are registered in `KNOWN_PAYLOAD_KEYS` with the AZ-272 schema-roundtrip fixtures wired in lockstep. The cross-record correlation keys are consistent (`flight_id`, `fingerprint`, `batch_uuid`, `observed_at_iso`) so a downstream auditor can join per-tile events back to the batch summary and forward to the per-flight key envelope. + +### Phase 6 — Cross-Task Consistency + +- **AZ-317 ↔ AZ-318 ↔ AZ-319 wiring contract**: `HttpTileUploader.upload_pending_tiles` honours the FROZEN order `gate → start_session → enumerate → batch loop → finally end_session`. AC-2 (gate blocks before ANY work) and AC-5/AC-6 (zeroisation on every exit path) are both asserted in the AZ-319 unit suite. The fingerprint returned by `start_session` is the same value carried in every per-tile FDR record + the final `batch.complete` record, providing the safety officer's correlation key. +- **AZ-319 ↔ AZ-272 schema contract**: every key the uploader emits in any FDR payload is a subset of `KNOWN_PAYLOAD_KEYS[kind]`. The AZ-272 schema test fixtures cover the three new kinds; the c7 conformance pattern (`set(payload.keys()) - {"extra"} <= expected_keys`) would pass for the uploader if applied (uploader does not have its own conformance test in this style — see F2 below). +- **AZ-319 ↔ AZ-270 cross-component lint**: the consumer-side cuts in `tile_uploader.py` over c6 surfaces (`_TileBytesReader`, `_PendingMetadataReader`, `_TilePixelHandleLike`) keep the c11 component free of `from gps_denied_onboard.components.c6_tile_cache import …` statements. AZ-270 AST scan is green. +- **AZ-325 ↔ AZ-515 hash-identity contract**: the build-identity hash CacheProvisioner uses for idempotence checks is byte-aligned with the manifest hash AZ-323 wrote, because both call into the SAME `_canonical_hash.aggregate_tile_hash` helper. Re-reading AZ-325's `provisioner.py` confirms the import path now points to `_canonical_hash` (post-AZ-515), not `manifest_builder._aggregate_tile_hash`. +- **C11 component layout consistency with C10**: c11 follows the same `_types.py` / `errors.py` / `interface.py` / `.py` / `__init__.py` (with `register_component_block`) layout as c10, plus `config.py` for the per-component config block. The split keeps the public API surface declared in one place per component. + +### Phase 7 — Architecture Compliance + +- **Layer direction**: c11 production code imports only from `_types/*` (none used yet — c11 declares its DTOs locally), `helpers/*` (none used), `config`, `logging`, `clock` (via `WallClock` indirection), `fdr_client`, plus `httpx` and `cryptography` (third-party, already pinned). All Layer 1 or lower. No upward imports. +- **Public API respect**: `runtime_root/c11_factory.py` is the single cross-component seam for c11. The AZ-270 lint exempts `runtime_root/*`. No `components.c6_*` import appears anywhere inside `components/c11_tile_manager/**/*.py`. +- **No new cyclic module dependencies**: verified by import grep across `src/gps_denied_onboard/components/`. +- **Duplicate symbols across components**: no new duplicates introduced. The two active `_iso_ts_now` copies in c7 (carryover from prior windows) remain — AZ-508 still tracks the cross-component consolidation. C11 introduces a local `_iso_now` helper in `tile_uploader.py` (UTC RFC 3339); see F1 for the consolidation suggestion. +- **Cross-cutting concerns reuse**: `c11_tile_manager` correctly uses `fdr_client.make_fdr_client`, `logging.get_logger`, `clock.wall_clock.WallClock` (via the deferred import in `_default_sleep`), and `config.schema.register_component_block`. No re-implementations. +- **Time-handling discipline**: `tile_uploader.py` does NOT call `time.sleep` directly; it routes through `WallClock.sleep_until_ns` (deferred import inside the `_default_sleep` helper). The AC-4 AST scan over `components/` from AZ-398 stays clean. + +## Findings + +| # | Severity | Category | File:Line | Title | +|---|----------|----------|-----------|-------| +| 1 | Low | Maintainability | `c11_tile_manager/tile_uploader.py:739` | Local `_iso_now` helper duplicates the `_iso_ts_now` pattern that AZ-508 plans to consolidate | +| 2 | Low | Spec-Gap | `c11_tile_manager/tile_uploader.py` constructor | Constructor takes `sleep` callable rather than the spec-listed `clock: Clock` parameter | +| 3 | Low | Spec-Gap | `_docs/02_tasks/done/AZ-319_*.md` Outcome §, `tile_uploader.py:328-334` | Spec text describes `outcome = failure` as a return value for gate / auth / persistent-5xx scenarios; impl raises in those cases (and writes `failure` into the FDR `batch.complete` record) | + +### Finding Details + +**F1: `_iso_now` helper is yet another `_iso_ts_now` copy** (Low / Maintainability) + +- Location: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:739` — `def _iso_now() -> str: return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")`. +- Description: AZ-508 (in `_docs/02_tasks/todo/`) tracks the consolidation of all `_iso_ts_now` definitions across the codebase into a single `helpers/iso_timestamps.py` module. The c7 modules (`onnx_trt_ep_runtime.py`, `thermal_publisher.py`) are the existing duplication sites; c11 now adds a third active site with the same body. The drift risk is the format string itself — the c11 helper uses `strftime("%Y-%m-%dT%H:%M:%S.%fZ")` (forced UTC suffix), while c6's now-consolidated `_timestamp.iso_ts_now` uses `.isoformat()` (canonical RFC 3339 with `+00:00`). The two formats are NOT byte-identical for the same input. +- Suggestion: refresh AZ-508's "Problem" / "Outcome" / "Included" sections to add the new c11 site, and standardise on ONE format (recommend `.isoformat(timespec="microseconds")` to match the existing AZ-272 schema fixture `"2025-01-15T08:00:00.000000+00:00"` style). This also feeds back into AZ-318's `record_signature_rejection` logic if it uses the c11 helper. +- Recommendation: refresh AZ-508; do not gate downstream batches on this. +- Task: AZ-508 (existing). + +**F2: Constructor signature deviates from spec — `sleep` callable instead of `clock: Clock`** (Low / Spec-Gap) + +- Location: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:257-269` — `__init__(..., sleep: Any = None)`. +- Description: The AZ-319 task spec (now archived in `_docs/02_tasks/done/AZ-319_c11_tile_uploader.md`) lists `clock: Clock` as the constructor parameter for the timing-sensitive 429 / 5xx backoff path. The implementation accepts a callable `sleep` parameter instead, defaulting to a `WallClock`-routed helper. Reasoning is sound (the uploader only ever needs a sleep primitive — never `monotonic_ns` or `time_ns`), and the AZ-398 invariant (no `time.sleep` in `components/`) still holds because the default helper routes through `WallClock.sleep_until_ns`. But the deviation is a documented mismatch between the spec text and the as-built constructor. +- Suggestion: pick one of: + - (a) Update the AZ-319 spec text to match the implementation — `sleep: Callable[[float], None] = _default_sleep` — and document the rationale in the contract Shape section. + - (b) Refactor the constructor to take `clock: Clock` and call `self._clock.sleep_until_ns(...)` directly. +- Recommendation: (a) — the contract is the source of truth, and threading the full Clock Protocol through a class that only needs sleep is unnecessary indirection. +- Task: file as 1-pt task hygiene PBI against the AZ-319 spec text (no production code change). + +**F3: Spec text describes returned `outcome = failure`; impl raises instead** (Low / Spec-Gap) + +- Location: + - Spec: `_docs/02_tasks/done/AZ-319_c11_tile_uploader.md` § Outcome lines 60-66 ("`outcome = failure` if the gate blocked, the API key was invalid, or zero tiles could be POSTed"). + - Impl: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:328-334` (raises `SatelliteProviderError` / `RateLimitedError` / `FlightStateNotOnGroundError`); `:670-693` (`_emit_batch_complete` writes `outcome = failure` into the FDR record on the exception path via the `try/except/finally` pattern). +- Description: The implementation NEVER returns a `UploadBatchReport(outcome=FAILURE)` — every failure path raises a typed exception. The `UploadOutcome.FAILURE` enum value IS used, but only inside the FDR `c11.upload.batch.complete` record emitted from the `finally` block. AC-2 (gate blocks → raises), AC-9 (5x 503 → raises), AC-10 (401 → raises) all assert the raise behaviour, so the impl is internally consistent and passes its own AC suite. The spec text drift means a reader of the spec would expect to handle a `FAILURE` outcome in the returned report, when in fact they need a `try/except` around the call. +- Suggestion: update the AZ-319 spec text to clarify that `UploadOutcome.FAILURE` is an FDR-only value (returned report is always `SUCCESS` or `PARTIAL` because failure paths raise). The contract document `_docs/02_document/contracts/c11_tilemanager/tile_uploader.md` should be the canonical source for this — verify it matches the impl. +- Recommendation: 1-pt task hygiene PBI to align spec + contract + impl text. +- Task: file as 1-pt PBI; not blocking. + +## Baseline Delta + +`_docs/02_document/architecture_compliance_baseline.md` does not exist (greenfield project). The Baseline Delta section is omitted per `code-review/SKILL.md` "Baseline delta". + +## Verdict Logic + +- 0 Critical +- 0 High +- 0 Medium +- 3 Low (1 Maintainability, 2 Spec-Gap) + +→ **PASS_WITH_WARNINGS**: only Low findings; all three are documentation-level drift between spec text and implementation, with no impact on production behaviour. None block progression to Batch 40. F1 already has a tracking PBI (AZ-508) that needs a small refresh; F2 + F3 each warrant a 1-pt task-hygiene PBI to align the spec with the as-built constructor signature and outcome-vs-raise semantics. + +## Test Suite Snapshot + +- Full unit suite after Batch 39: **1404 passed**, 80 skipped, 0 failed. +- Skips are environment-gated (Docker compose, CUDA, TensorRT, Tier-2 Jetson hardware, `actionlint`); none are AZ-317/318/319/325/515-related. +- C11 unit suite: 41 tests passing (13 AZ-317 + 13 AZ-318 + 17 AZ-319 / Protocol conformance + 5 from prior C11 work). +- C10 unit suite: AZ-325 / AZ-515 tests covered in the per-batch reports for batches 37 + 38. + +## Carryover Status Against 34–36 Review + +| Previous finding | Severity | Status after batch 39 | +|---|---|---| +| F1 (verifier reaches into builder's private `_aggregate_tile_hash`) | Low / Maintainability | **RESOLVED** by AZ-515 (`c10_provisioning/_canonical_hash.py` extracted; both builder + verifier import from there). | +| F2 (AZ-508 task spec is stale) | Low / Maintainability | **CARRIED OVER** — still pending; the new c11 `_iso_now` helper (this review's F1) makes the refresh more urgent. | +| F3 (consumer-side Protocol cut pattern un-documented in `architecture.md`) | Low / Maintainability | **CARRIED OVER** — pattern now 10+ active instances in production. Codified in `module-layout.md` Rule 9 but architecture.md still lacks a dedicated section. | diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 4a4e48f..76168df 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -13,4 +13,4 @@ retry_count: 0 cycle: 1 tracker: jira last_completed_batch: 39 -last_cumulative_review: batches_34-36 +last_cumulative_review: batches_37-39