mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 06:51:12 +00:00
chore: cumulative review batches 37-39 (PASS_WITH_WARNINGS)
Captures the C11 operator-side trio landing (AZ-317/318/319) plus the C10 build-orchestrator close-out (AZ-325) and the AZ-515 canonical-hash extraction. Three Low findings, all documentation-level drift between spec text and as-built code; none block Batch 40. Resolves prior F1 (AZ-515 closed the verifier-into-builder private import). Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,115 @@
|
||||
# Cumulative Code Review — Batches 37–39 / Cycle 1
|
||||
|
||||
**Date**: 2026-05-13
|
||||
**Mode**: Cumulative (all 7 phases, emphasis on Phases 6 + 7)
|
||||
**Batches covered**: 37, 38, 39
|
||||
**Tasks covered**: AZ-325 (C10 CacheProvisioner), AZ-515 (C10 canonical-hash extraction), AZ-317 (C11 FlightStateGate), AZ-318 (C11 PerFlightKeyManager), AZ-319 (C11 HttpTileUploader)
|
||||
**Changed files in scope**: 12 production + 6 tests + 5 docs + 1 archive (see "Scope" below)
|
||||
|
||||
| Domain | Files (changed since cumulative_review_batches_34-36_cycle1_report.md) |
|
||||
|--------|------------------------------------------------------------------------|
|
||||
| c10_provisioning (production) | `provisioner.py` (new, AZ-325), `_canonical_hash.py` (new, AZ-515 — extracted shared aggregate-hash module), `manifest_builder.py` (refactor — now imports `_canonical_hash`), `manifest_verifier.py` (refactor — now imports `_canonical_hash`), `__init__.py` (re-exports `CacheProvisionerImpl`) |
|
||||
| c11_tile_manager (production) | `_types.py` (new — `FlightStateSignal`, `PublicKeyFingerprint`, `IngestStatus`, `UploadOutcome`, `UploadRequest`, `PerTileStatus`, `UploadBatchReport`), `errors.py` (new — `TileManagerError` parent + 5 subclasses), `interface.py` (extended — `FlightStateSource`, real `TileUploader` Protocol), `flight_state_gate.py` (new, AZ-317), `signing_key.py` (new, AZ-318), `config.py` (new, AZ-319 — `C11Config`), `tile_uploader.py` (new, AZ-319 — `HttpTileUploader` + 3 consumer-side cuts over c6), `__init__.py` (full upload-side surface re-exports + `register_component_block`) |
|
||||
| runtime_root (composition root) | `c11_factory.py` (new — `build_flight_state_gate`, `build_per_flight_key_manager`, `build_tile_uploader`) |
|
||||
| fdr_client (cross-cutting) | `records.py` (5 new `KNOWN_PAYLOAD_KEYS` entries: `c11.upload.session.key.public`, `c11.upload.signature_rejected`, `c11.upload.tile.queued`, `c11.upload.tile.rejected`, `c11.upload.batch.complete`) |
|
||||
| Tests | `tests/unit/c10_provisioning/test_cache_provisioner.py` (new, AZ-325), `tests/unit/c11_tile_manager/test_flight_state_gate.py` (new, AZ-317 — 13 tests), `tests/unit/c11_tile_manager/test_signing_key.py` (new, AZ-318 — 13 tests), `tests/unit/c11_tile_manager/test_tile_uploader.py` (new, AZ-319 — 15 tests), `tests/unit/c11_tile_manager/test_protocol_conformance.py` (new, AZ-319 AC-12), `tests/unit/test_az272_fdr_record_schema.py` (5 fixture additions) |
|
||||
| Docs | `_docs/02_tasks/done/AZ-325_*`, `done/AZ-515_*`, `done/AZ-317_*`, `done/AZ-318_*`, `done/AZ-319_*` (archived from `todo/`); per-batch reports + reviews under `_docs/03_implementation/` |
|
||||
|
||||
**Verdict**: **PASS_WITH_WARNINGS**
|
||||
|
||||
## Summary
|
||||
|
||||
No Critical or High findings. Three findings total: all Low (one Maintainability / two Spec-Gap). The dominant achievement of this window is the **end-to-end landing of the C11 operator-side upload path** (gate → ephemeral signing → multipart POST → mark uploaded → FDR alert), the **closure of the C10 build orchestrator** (CacheProvisioner) for E-C12 consumption, and the **resolution of the prior cumulative review's F1** via AZ-515's `_canonical_hash.py` extraction.
|
||||
|
||||
### Architecture-level outcomes
|
||||
|
||||
1. **C11 operator-side trio is fully wired.** AZ-317 (gate), AZ-318 (per-flight Ed25519 ephemeral key), and AZ-319 (`HttpTileUploader`) compose at the operator-binary composition root through `build_flight_state_gate` / `build_per_flight_key_manager` / `build_tile_uploader`. The contract surface (`TileUploader` Protocol + `UploadRequest` / `UploadBatchReport` / `PerTileStatus` / `IngestStatus` / `UploadOutcome` DTOs) is canonical and re-exported through `c11_tile_manager.__init__`. Three D-PROJ-2 ingest contract concerns are handled in code: canonical signing-payload bytes (deterministic SHA-256 over a frozen field order), `Retry-After` parsing for both RFC 7231 forms, and per-tile signature-rejection routing to `PerFlightKeyManager.record_signature_rejection`.
|
||||
|
||||
2. **Consumer-side Protocol cut pattern continues to scale.** AZ-319 introduces three more cuts (`_TilePixelHandleLike`, `_TileBytesReader`, `_PendingMetadataReader`) over c6, all local to `c11_tile_manager.tile_uploader`. The composition root binds the concrete c6 implementations through `build_tile_uploader`. AZ-270 lint (`test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies`) remains green; zero `components.X` cross-component imports inside `src/gps_denied_onboard/components/**/*.py`. Cumulative count of consumer-side cuts in production code: **10+** (4 in c10_provisioning from AZ-322/323/324, 3 in c11_tile_manager.tile_uploader from AZ-319, plus the FlightStateSource cut from AZ-317 and the engine cuts from AZ-321).
|
||||
|
||||
3. **AZ-515 closed F1 from the prior cumulative review.** The previous window's F1 (verifier reaching into builder's private `_aggregate_tile_hash`) is resolved by AZ-515 — the canonical aggregate-hash helper now lives at `c10_provisioning/_canonical_hash.py` and is imported symmetrically by `manifest_builder.py` and `manifest_verifier.py`. The intra-component shared utility makes the trust-chain glue between AZ-323 and AZ-324 explicit. No public API impact.
|
||||
|
||||
4. **FDR registry expanded by five kinds.** All five C11 upload-side records (`session.key.public`, `signature_rejected`, `tile.queued`, `tile.rejected`, `batch.complete`) are registered in `KNOWN_PAYLOAD_KEYS` with the AZ-272 schema-roundtrip fixtures wired in lockstep. The cross-record correlation keys are consistent (`flight_id`, `fingerprint`, `batch_uuid`, `observed_at_iso`) so a downstream auditor can join per-tile events back to the batch summary and forward to the per-flight key envelope.
|
||||
|
||||
### Phase 6 — Cross-Task Consistency
|
||||
|
||||
- **AZ-317 ↔ AZ-318 ↔ AZ-319 wiring contract**: `HttpTileUploader.upload_pending_tiles` honours the FROZEN order `gate → start_session → enumerate → batch loop → finally end_session`. AC-2 (gate blocks before ANY work) and AC-5/AC-6 (zeroisation on every exit path) are both asserted in the AZ-319 unit suite. The fingerprint returned by `start_session` is the same value carried in every per-tile FDR record + the final `batch.complete` record, providing the safety officer's correlation key.
|
||||
- **AZ-319 ↔ AZ-272 schema contract**: every key the uploader emits in any FDR payload is a subset of `KNOWN_PAYLOAD_KEYS[kind]`. The AZ-272 schema test fixtures cover the three new kinds; the c7 conformance pattern (`set(payload.keys()) - {"extra"} <= expected_keys`) would pass for the uploader if applied (uploader does not have its own conformance test in this style — see F2 below).
|
||||
- **AZ-319 ↔ AZ-270 cross-component lint**: the consumer-side cuts in `tile_uploader.py` over c6 surfaces (`_TileBytesReader`, `_PendingMetadataReader`, `_TilePixelHandleLike`) keep the c11 component free of `from gps_denied_onboard.components.c6_tile_cache import …` statements. AZ-270 AST scan is green.
|
||||
- **AZ-325 ↔ AZ-515 hash-identity contract**: the build-identity hash CacheProvisioner uses for idempotence checks is byte-aligned with the manifest hash AZ-323 wrote, because both call into the SAME `_canonical_hash.aggregate_tile_hash` helper. Re-reading AZ-325's `provisioner.py` confirms the import path now points to `_canonical_hash` (post-AZ-515), not `manifest_builder._aggregate_tile_hash`.
|
||||
- **C11 component layout consistency with C10**: c11 follows the same `_types.py` / `errors.py` / `interface.py` / `<concrete>.py` / `__init__.py` (with `register_component_block`) layout as c10, plus `config.py` for the per-component config block. The split keeps the public API surface declared in one place per component.
|
||||
|
||||
### Phase 7 — Architecture Compliance
|
||||
|
||||
- **Layer direction**: c11 production code imports only from `_types/*` (none used yet — c11 declares its DTOs locally), `helpers/*` (none used), `config`, `logging`, `clock` (via `WallClock` indirection), `fdr_client`, plus `httpx` and `cryptography` (third-party, already pinned). All Layer 1 or lower. No upward imports.
|
||||
- **Public API respect**: `runtime_root/c11_factory.py` is the single cross-component seam for c11. The AZ-270 lint exempts `runtime_root/*`. No `components.c6_*` import appears anywhere inside `components/c11_tile_manager/**/*.py`.
|
||||
- **No new cyclic module dependencies**: verified by import grep across `src/gps_denied_onboard/components/`.
|
||||
- **Duplicate symbols across components**: no new duplicates introduced. The two active `_iso_ts_now` copies in c7 (carryover from prior windows) remain — AZ-508 still tracks the cross-component consolidation. C11 introduces a local `_iso_now` helper in `tile_uploader.py` (UTC RFC 3339); see F1 for the consolidation suggestion.
|
||||
- **Cross-cutting concerns reuse**: `c11_tile_manager` correctly uses `fdr_client.make_fdr_client`, `logging.get_logger`, `clock.wall_clock.WallClock` (via the deferred import in `_default_sleep`), and `config.schema.register_component_block`. No re-implementations.
|
||||
- **Time-handling discipline**: `tile_uploader.py` does NOT call `time.sleep` directly; it routes through `WallClock.sleep_until_ns` (deferred import inside the `_default_sleep` helper). The AC-4 AST scan over `components/` from AZ-398 stays clean.
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Low | Maintainability | `c11_tile_manager/tile_uploader.py:739` | Local `_iso_now` helper duplicates the `_iso_ts_now` pattern that AZ-508 plans to consolidate |
|
||||
| 2 | Low | Spec-Gap | `c11_tile_manager/tile_uploader.py` constructor | Constructor takes `sleep` callable rather than the spec-listed `clock: Clock` parameter |
|
||||
| 3 | Low | Spec-Gap | `_docs/02_tasks/done/AZ-319_*.md` Outcome §, `tile_uploader.py:328-334` | Spec text describes `outcome = failure` as a return value for gate / auth / persistent-5xx scenarios; impl raises in those cases (and writes `failure` into the FDR `batch.complete` record) |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: `_iso_now` helper is yet another `_iso_ts_now` copy** (Low / Maintainability)
|
||||
|
||||
- Location: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:739` — `def _iso_now() -> str: return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")`.
|
||||
- Description: AZ-508 (in `_docs/02_tasks/todo/`) tracks the consolidation of all `_iso_ts_now` definitions across the codebase into a single `helpers/iso_timestamps.py` module. The c7 modules (`onnx_trt_ep_runtime.py`, `thermal_publisher.py`) are the existing duplication sites; c11 now adds a third active site with the same body. The drift risk is the format string itself — the c11 helper uses `strftime("%Y-%m-%dT%H:%M:%S.%fZ")` (forced UTC suffix), while c6's now-consolidated `_timestamp.iso_ts_now` uses `.isoformat()` (canonical RFC 3339 with `+00:00`). The two formats are NOT byte-identical for the same input.
|
||||
- Suggestion: refresh AZ-508's "Problem" / "Outcome" / "Included" sections to add the new c11 site, and standardise on ONE format (recommend `.isoformat(timespec="microseconds")` to match the existing AZ-272 schema fixture `"2025-01-15T08:00:00.000000+00:00"` style). This also feeds back into AZ-318's `record_signature_rejection` logic if it uses the c11 helper.
|
||||
- Recommendation: refresh AZ-508; do not gate downstream batches on this.
|
||||
- Task: AZ-508 (existing).
|
||||
|
||||
**F2: Constructor signature deviates from spec — `sleep` callable instead of `clock: Clock`** (Low / Spec-Gap)
|
||||
|
||||
- Location: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:257-269` — `__init__(..., sleep: Any = None)`.
|
||||
- Description: The AZ-319 task spec (now archived in `_docs/02_tasks/done/AZ-319_c11_tile_uploader.md`) lists `clock: Clock` as the constructor parameter for the timing-sensitive 429 / 5xx backoff path. The implementation accepts a callable `sleep` parameter instead, defaulting to a `WallClock`-routed helper. Reasoning is sound (the uploader only ever needs a sleep primitive — never `monotonic_ns` or `time_ns`), and the AZ-398 invariant (no `time.sleep` in `components/`) still holds because the default helper routes through `WallClock.sleep_until_ns`. But the deviation is a documented mismatch between the spec text and the as-built constructor.
|
||||
- Suggestion: pick one of:
|
||||
- (a) Update the AZ-319 spec text to match the implementation — `sleep: Callable[[float], None] = _default_sleep` — and document the rationale in the contract Shape section.
|
||||
- (b) Refactor the constructor to take `clock: Clock` and call `self._clock.sleep_until_ns(...)` directly.
|
||||
- Recommendation: (a) — the contract is the source of truth, and threading the full Clock Protocol through a class that only needs sleep is unnecessary indirection.
|
||||
- Task: file as 1-pt task hygiene PBI against the AZ-319 spec text (no production code change).
|
||||
|
||||
**F3: Spec text describes returned `outcome = failure`; impl raises instead** (Low / Spec-Gap)
|
||||
|
||||
- Location:
|
||||
- Spec: `_docs/02_tasks/done/AZ-319_c11_tile_uploader.md` § Outcome lines 60-66 ("`outcome = failure` if the gate blocked, the API key was invalid, or zero tiles could be POSTed").
|
||||
- Impl: `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py:328-334` (raises `SatelliteProviderError` / `RateLimitedError` / `FlightStateNotOnGroundError`); `:670-693` (`_emit_batch_complete` writes `outcome = failure` into the FDR record on the exception path via the `try/except/finally` pattern).
|
||||
- Description: The implementation NEVER returns a `UploadBatchReport(outcome=FAILURE)` — every failure path raises a typed exception. The `UploadOutcome.FAILURE` enum value IS used, but only inside the FDR `c11.upload.batch.complete` record emitted from the `finally` block. AC-2 (gate blocks → raises), AC-9 (5x 503 → raises), AC-10 (401 → raises) all assert the raise behaviour, so the impl is internally consistent and passes its own AC suite. The spec text drift means a reader of the spec would expect to handle a `FAILURE` outcome in the returned report, when in fact they need a `try/except` around the call.
|
||||
- Suggestion: update the AZ-319 spec text to clarify that `UploadOutcome.FAILURE` is an FDR-only value (returned report is always `SUCCESS` or `PARTIAL` because failure paths raise). The contract document `_docs/02_document/contracts/c11_tilemanager/tile_uploader.md` should be the canonical source for this — verify it matches the impl.
|
||||
- Recommendation: 1-pt task hygiene PBI to align spec + contract + impl text.
|
||||
- Task: file as 1-pt PBI; not blocking.
|
||||
|
||||
## Baseline Delta
|
||||
|
||||
`_docs/02_document/architecture_compliance_baseline.md` does not exist (greenfield project). The Baseline Delta section is omitted per `code-review/SKILL.md` "Baseline delta".
|
||||
|
||||
## Verdict Logic
|
||||
|
||||
- 0 Critical
|
||||
- 0 High
|
||||
- 0 Medium
|
||||
- 3 Low (1 Maintainability, 2 Spec-Gap)
|
||||
|
||||
→ **PASS_WITH_WARNINGS**: only Low findings; all three are documentation-level drift between spec text and implementation, with no impact on production behaviour. None block progression to Batch 40. F1 already has a tracking PBI (AZ-508) that needs a small refresh; F2 + F3 each warrant a 1-pt task-hygiene PBI to align the spec with the as-built constructor signature and outcome-vs-raise semantics.
|
||||
|
||||
## Test Suite Snapshot
|
||||
|
||||
- Full unit suite after Batch 39: **1404 passed**, 80 skipped, 0 failed.
|
||||
- Skips are environment-gated (Docker compose, CUDA, TensorRT, Tier-2 Jetson hardware, `actionlint`); none are AZ-317/318/319/325/515-related.
|
||||
- C11 unit suite: 41 tests passing (13 AZ-317 + 13 AZ-318 + 17 AZ-319 / Protocol conformance + 5 from prior C11 work).
|
||||
- C10 unit suite: AZ-325 / AZ-515 tests covered in the per-batch reports for batches 37 + 38.
|
||||
|
||||
## Carryover Status Against 34–36 Review
|
||||
|
||||
| Previous finding | Severity | Status after batch 39 |
|
||||
|---|---|---|
|
||||
| F1 (verifier reaches into builder's private `_aggregate_tile_hash`) | Low / Maintainability | **RESOLVED** by AZ-515 (`c10_provisioning/_canonical_hash.py` extracted; both builder + verifier import from there). |
|
||||
| F2 (AZ-508 task spec is stale) | Low / Maintainability | **CARRIED OVER** — still pending; the new c11 `_iso_now` helper (this review's F1) makes the refresh more urgent. |
|
||||
| F3 (consumer-side Protocol cut pattern un-documented in `architecture.md`) | Low / Maintainability | **CARRIED OVER** — pattern now 10+ active instances in production. Codified in `module-layout.md` Rule 9 but architecture.md still lacks a dedicated section. |
|
||||
@@ -13,4 +13,4 @@ retry_count: 0
|
||||
cycle: 1
|
||||
tracker: jira
|
||||
last_completed_batch: 39
|
||||
last_cumulative_review: batches_34-36
|
||||
last_cumulative_review: batches_37-39
|
||||
|
||||
Reference in New Issue
Block a user