mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 15:41:12 +00:00
[AZ-319] C11 HttpTileUploader (post-landing upload path)

Lands the production HttpTileUploader composing AZ-317's gate, AZ-318's per-flight signing, and consumer-side cuts over c6 storage.

Implements the full upload flow: gate ON_GROUND -> start_session -> enumerate pending -> per-batch multipart POST with Ed25519 signing -> mark_uploaded on ack -> end_session in finally. Honours Retry-After (RFC 7231 int + HTTP-date), exponential backoff on 5xx, fail-fast on TLS/401/403.

Adds C11Config block, three FDR kinds (tile.queued, tile.rejected, batch.complete), and the build_tile_uploader composition-root factory. Cross-component access to c6 stays Protocol-cut (AZ-507 / AZ-270).

Tests: 17 new unit tests covering AC-1..AC-14 plus throughput NFR; AZ-272 schema fixtures for the three new FDR kinds. Full unit suite: 1404 passed.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,81 @@
# Batch 39 — Code Review

**Tasks**: AZ-319 (C11 TileUploader)
**Cycle**: 1
**Reviewer**: autodev
**Verdict**: **PASS_WITH_WARNINGS**

## Scope reviewed

Production code:

- `src/gps_denied_onboard/components/c11_tile_manager/_types.py` (additions)
- `src/gps_denied_onboard/components/c11_tile_manager/errors.py` (additions)
- `src/gps_denied_onboard/components/c11_tile_manager/config.py` (new)
- `src/gps_denied_onboard/components/c11_tile_manager/interface.py` (TileUploader signature)
- `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py` (new — `HttpTileUploader`)
- `src/gps_denied_onboard/components/c11_tile_manager/__init__.py` (exports + `register_component_block`)
- `src/gps_denied_onboard/runtime_root/c11_factory.py` (`build_tile_uploader`)
- `src/gps_denied_onboard/fdr_client/records.py` (3 new `KNOWN_PAYLOAD_KEYS` entries)

Tests:

- `tests/unit/c11_tile_manager/test_tile_uploader.py` (15 tests — AC-1..AC-11, AC-13, AC-14, rate-limit budget, NFR)
- `tests/unit/c11_tile_manager/test_protocol_conformance.py` (2 tests — AC-12)
- `tests/unit/test_az272_fdr_record_schema.py` (3 fixture additions)

## Phase 1 — Architecture

### AZ-507 cross-component rule

`tile_uploader.py` does NOT import from any other `components.*` module. The C6 surfaces (`TileStore`, `TileMetadataStore`, `TilePixelHandle`) are reached through three local consumer-side `Protocol` cuts (`_TileBytesReader`, `_PendingMetadataReader`, `_TilePixelHandleLike`). The composition root binds the concrete c6 implementations in `build_tile_uploader`. The AZ-270 lint test (`test_ac6_only_compose_root_imports_concrete_strategies`) passes.

### Composition root

`build_tile_uploader` reads the `C11Config` block from `config.components['c11_tile_manager']` and fails fast with `ConfigError` when `satellite_provider_ingest_url` or `companion_id` is empty. The safe defaults exist only for unit-test bootstrap; production wiring MUST set both. The factory also registers the FDR producer via `make_fdr_client`.
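
The fail-fast check described above could be sketched roughly as follows. The two field names and the `ConfigError` name come from this review; the dataclass shape, the empty-string defaults, and the `validate_c11_config` helper are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Stand-in for the project's ConfigError."""


@dataclass(frozen=True)
class C11Config:
    # Field names are from the review; the empty defaults are an assumption.
    satellite_provider_ingest_url: str = ""
    companion_id: str = ""


def validate_c11_config(cfg: C11Config) -> C11Config:
    """Raise ConfigError if a field that production must set is still empty."""
    missing = [
        name
        for name in ("satellite_provider_ingest_url", "companion_id")
        if not getattr(cfg, name).strip()
    ]
    if missing:
        raise ConfigError(
            f"c11_tile_manager: missing required field(s): {', '.join(missing)}"
        )
    return cfg
```

Validating once at the composition root means the uploader itself never has to handle a half-configured state.
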

### FDR / log envelopes

Three new `KNOWN_PAYLOAD_KEYS` entries were added. Per-tile records carry `flight_id`, `tile_id`, `fingerprint`, and `batch_uuid`. The `c11.upload.batch.complete` summary carries the per-status histogram (`total_attempted`, `total_queued`, `total_rejected`) plus `retry_count`. The AZ-272 schema test (`tests/unit/test_az272_fdr_record_schema.py`) covers all three new kinds. Structured logs use the `kv` envelope and contain no secrets.

## Phase 2 — Behaviour vs. spec

| Spec requirement | Status |
|------------------|--------|
| Gate first; zero side effects on failure | PASS |
| `start_session` after gate, `end_session` in `finally` | PASS |
| `mark_uploaded` only on `queued / duplicate / superseded` | PASS |
| `record_signature_rejection` when `rejection_reason` mentions "signature" | PASS |
| Multipart via `httpx`'s `files=`, no manual boundary | PASS |
| Canonical bytes order frozen; SHA-256 over deterministic concatenation | PASS |
| 429 honours `Retry-After` (int seconds + HTTP-date), capped via config | PASS |
| 5xx exponential backoff (1s/2s/4s/8s) → `SatelliteProviderError` after 4 | PASS |
| 401/403 fail-fast → `SatelliteProviderError` | PASS |
| `outcome = success \| partial`; failure paths raise (do NOT return) | PASS (see F1) |
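
The two retry rows in the table can be illustrated with a small standard-library sketch. It parses both `Retry-After` forms allowed by RFC 7231 (delta-seconds and HTTP-date) and computes the 1s/2s/4s/8s schedule. The function names and the `cap_s` parameter are hypothetical; the review only says the cap comes from config.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: str, now: datetime, cap_s: float) -> Optional[float]:
    """Return the delay in seconds, clamped to [0, cap_s], or None if unparseable."""
    text = value.strip()
    if text.isascii() and text.isdigit():
        # Delta-seconds form, e.g. "120".
        delay = float(text)
    else:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        try:
            when = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return None
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        delay = (when - now).total_seconds()
    return min(max(delay, 0.0), cap_s)


def backoff_delay_s(attempt: int, base_s: float = 1.0) -> float:
    """Delay before retry number `attempt` (0-based): 1, 2, 4, 8 seconds."""
    return base_s * (2 ** attempt)
```

Passing `now` in explicitly keeps the parser deterministic under test; a date in the past clamps to a zero delay rather than going negative.
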

## Findings

**F1 — Low (Spec wording vs. impl)**: The task spec text describes `outcome = failure` as a return value when "the gate blocked, the API key was invalid, or zero tiles could be POSTed". My implementation instead raises `FlightStateNotOnGroundError` / `SatelliteProviderError` / `RateLimitedError` in those cases and never returns a `FAILURE` report. This matches the contract's exception matrix and is what the unit tests (AC-2 / AC-9 / AC-10) actually assert, so the implementation is internally consistent; only the spec's prose hints at a returned `FAILURE`. The `UploadOutcome.FAILURE` enum value is still wired into `_emit_batch_complete` for the FDR record's `outcome` field on the exception path, so an auditor can distinguish failure from success in the FDR stream. Action: documented here; no code change.
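
The control flow F1 describes (raise to the caller, still record `FAILURE` for the audit stream, always close the session) might look like the sketch below. `UploadOutcome` and `SatelliteProviderError` are named in this review but redefined here as stand-ins; `run_upload` and its three callable parameters are hypothetical.

```python
from enum import Enum


class UploadOutcome(Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


class SatelliteProviderError(RuntimeError):
    """Stand-in for the project's exception type."""


def run_upload(post_batches, emit_batch_complete, end_session):
    """Raise on failure, but still record the outcome and close the session."""
    try:
        outcome = post_batches()
    except SatelliteProviderError:
        # FAILURE appears only in the audit record, never as a return value.
        emit_batch_complete(UploadOutcome.FAILURE)
        raise
    else:
        emit_batch_complete(outcome)
        return outcome
    finally:
        # Runs on both the success and the exception path.
        end_session()
```
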

**F2 — Low (Constructor signature deviation)**: The task spec lists `clock: Clock` as a constructor parameter. My implementation injects a `sleep` callable instead, defaulting to a `WallClock`-routed sleep. Reasoning: `HttpTileUploader` only ever needs to sleep, never `monotonic_ns` or `time_ns`, so threading the full `Clock` Protocol through would add surface the class never uses. The default-sleep helper still routes through `WallClock.sleep_until_ns`, so the AZ-398 invariant (no direct `time.sleep` in `components/`) holds. Action: documented; revisit if the E-CC composition root standardises on a single Clock-everywhere convention.
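
A rough sketch of the injected-`sleep` pattern from F2, showing why it makes retry timing testable without real waiting. `RetryingSender` and `attempt_once` are hypothetical names; the default below calls `time.sleep` purely as a placeholder, whereas the review states the real default routes through `WallClock.sleep_until_ns`.

```python
import time
from typing import Callable


def _default_sleep(seconds: float) -> None:
    # Placeholder only: the real default is described as routing through
    # WallClock.sleep_until_ns rather than calling time.sleep directly.
    time.sleep(seconds)


class RetryingSender:
    """Hypothetical class that depends on a sleep callable, not a full Clock."""

    def __init__(self, sleep: Callable[[float], None] = _default_sleep) -> None:
        self._sleep = sleep

    def send(self, attempt_once: Callable[[], bool], max_retries: int = 4) -> bool:
        """Try once, then retry up to max_retries times with 1/2/4/8 s waits."""
        for attempt in range(max_retries + 1):
            if attempt_once():
                return True
            if attempt < max_retries:
                self._sleep(2.0 ** attempt)
        return False
```

In a test, passing `sleep=recorded.append` captures the requested delays in a list instead of blocking.
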

**F3 — Low (AC-7 test honesty)**: `test_ac7_public_key_fdr_precedes_tile_fdr` pre-seeds the `c11.upload.session.key.public` record into the `FakeFdrSink` because the test uses a stub key manager (not the real AZ-318 `PerFlightKeyManager`). In production wiring, both producers share the same FDR client, so ordering is guaranteed by the call sequence. The test docstring calls this out explicitly. Action: documented; the integration test in E-BBT will exercise the real AZ-318 manager.

**F4 — Low (Race window on partial-success batches)**: Inside the per-batch loop, the uploader marks a tile uploaded immediately after the parent suite acknowledges it. If the safety officer disputes the same tile within the audit window (≤ 1 s), the C6 row is already `uploaded`. Spec Risk-5 documents this and defers mitigation to a separate audit task. Action: no code change in this batch.

## Phase 3 — Tests

15 unit tests pass for `HttpTileUploader`; 2 for the Protocol conformance check; 3 fixture additions for the AZ-272 schema test. Full unit suite: **1404 passed, 80 skipped, 0 failed** (skips are environment-gated: Docker, CUDA, TensorRT, Tier-2 hardware).

NFR-perf-throughput: 1000 tiles complete in under 5 s with the in-process MockTransport, i.e. more than 200 tiles/s, well above the 20 tile/s contract floor. The mock removes the network bottleneck, so this verifies that uploader bookkeeping has no O(n²) regression rather than certifying real throughput.

## Phase 4 — Quality gates

- `ReadLints` clean across `c11_tile_manager/`, `runtime_root/c11_factory.py`, `fdr_client/records.py`, and the new test files
- No direct `time.sleep` in components (sleeps route via `WallClock.sleep_until_ns`)
- No secrets in logs (AC-10 test asserts no `BEGIN PUBLIC KEY` / `Authorization` substring in any captured log record)
- No new third-party dependencies (uses existing `httpx` and `cryptography` pins)
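
The no-secrets gate in the list above can be approximated with the standard `logging` module alone. This is a sketch of the idea, not the AC-10 test itself: the two forbidden substrings come from this review, while `ListHandler`, `leaked_secrets`, the logger name, and the sample log line are hypothetical.

```python
import logging

FORBIDDEN_SUBSTRINGS = ("BEGIN PUBLIC KEY", "Authorization")


class ListHandler(logging.Handler):
    """Collects rendered log messages so a test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def leaked_secrets(messages: list[str]) -> list[str]:
    """Return every message that contains a forbidden substring."""
    return [m for m in messages if any(s in m for s in FORBIDDEN_SUBSTRINGS)]


logger = logging.getLogger("c11.sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the sketch's output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

# A fail-fast path should log the status, not the credential that was sent.
logger.info("upload rejected status=%s batch=%s", 401, "b-1")
```

A test would then assert that `leaked_secrets(handler.messages)` is empty after driving the 401/403 path.
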

## Verdict

**PASS_WITH_WARNINGS**: All four findings are Low severity (documentation drift between spec text and implementation, a test-double honesty caveat, and a documented Risk-5 race window). No code change is required for batch close-out.