Lands the production HttpTileUploader composing AZ-317's gate, AZ-318's per-flight signing, and consumer-side cuts over c6 storage. Implements the full upload flow: gate ON_GROUND -> start_session -> enumerate pending -> per-batch multipart POST with Ed25519 signing -> mark_uploaded on ack -> end_session in finally. Honours Retry-After (RFC 7231 int + HTTP-date), exponential backoff on 5xx, fail-fast on TLS/401/403. Adds C11Config block, three FDR kinds (tile.queued, tile.rejected, batch.complete), and the build_tile_uploader composition-root factory. Cross-component access to c6 stays Protocol-cut (AZ-507 / AZ-270). Tests: 17 new unit tests covering AC-1..AC-14 plus throughput NFR; AZ-272 schema fixtures for the three new FDR kinds. Full unit suite: 1404 passed. Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 39 — Code Review
Tasks: AZ-319 (C11 TileUploader)
Cycle: 1
Reviewer: autodev
Verdict: PASS_WITH_WARNINGS
Scope reviewed
Production code:
- `src/gps_denied_onboard/components/c11_tile_manager/_types.py` (additions)
- `src/gps_denied_onboard/components/c11_tile_manager/errors.py` (additions)
- `src/gps_denied_onboard/components/c11_tile_manager/config.py` (new)
- `src/gps_denied_onboard/components/c11_tile_manager/interface.py` (`TileUploader` signature)
- `src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py` (new; `HttpTileUploader`)
- `src/gps_denied_onboard/components/c11_tile_manager/__init__.py` (exports + `register_component_block`)
- `src/gps_denied_onboard/runtime_root/c11_factory.py` (`build_tile_uploader`)
- `src/gps_denied_onboard/fdr_client/records.py` (3 new `KNOWN_PAYLOAD_KEYS` entries)
Tests:
- `tests/unit/c11_tile_manager/test_tile_uploader.py` (15 tests: AC-1..AC-11, AC-13, AC-14, rate-limit budget, NFR)
- `tests/unit/c11_tile_manager/test_protocol_conformance.py` (2 tests: AC-12)
- `tests/unit/test_az272_fdr_record_schema.py` (3 fixture additions)
Phase 1 — Architecture
AZ-507 cross-component rule
tile_uploader.py does NOT import from any other components.* module. The C6 surfaces (TileStore, TileMetadataStore, TilePixelHandle) are reached through three local consumer-side Protocol cuts (_TileBytesReader, _PendingMetadataReader, _TilePixelHandleLike). Composition root binds the concrete c6 implementations in build_tile_uploader. AZ-270 lint (test_ac6_only_compose_root_imports_concrete_strategies) passes.
Composition root
build_tile_uploader reads the C11Config block from config.components['c11_tile_manager'] and fails fast with ConfigError when either satellite_provider_ingest_url or companion_id is empty. The safe defaults exist only for unit-test bootstrap; production wiring MUST set both. The factory also registers the FDR producer via make_fdr_client.
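A fail-fast check of this kind might look like the sketch below. The `ConfigError` and `C11Config` definitions here are stand-ins written for the example, and `require_production_fields` is a hypothetical helper; only the two field names are taken from the review.

```python
from dataclasses import dataclass


class ConfigError(Exception):
    """Stand-in for the project's ConfigError."""


@dataclass(frozen=True)
class C11Config:
    # Empty defaults are an assumption modelling the unit-test bootstrap case.
    satellite_provider_ingest_url: str = ""
    companion_id: str = ""


def require_production_fields(cfg: C11Config) -> C11Config:
    """Raise ConfigError naming every required field that is empty."""
    missing = [
        name
        for name in ("satellite_provider_ingest_url", "companion_id")
        if not getattr(cfg, name).strip()
    ]
    if missing:
        raise ConfigError(
            f"c11_tile_manager: missing required field(s): {', '.join(missing)}"
        )
    return cfg
```

Reporting all missing fields in one error saves an operator from fixing the configuration one field at a time.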
FDR / log envelopes
Three new KNOWN_PAYLOAD_KEYS entries added; per-tile records carry flight_id, tile_id, fingerprint, batch_uuid; the c11.upload.batch.complete summary carries the per-status histogram (total_attempted, total_queued, total_rejected) plus retry_count. AZ-272 schema test (tests/unit/test_az272_fdr_record_schema.py) covers all three new kinds. Structured logs use the kv envelope with no secrets.
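For illustration, the summary payload could be assembled along these lines. The four key names come from the paragraph above; the function name, the status strings, and the choice to count only `"queued"` and `"rejected"` statuses into their respective totals are assumptions made for the sketch.

```python
from collections import Counter


def batch_complete_payload(statuses: list[str], retry_count: int) -> dict[str, int]:
    """Build a per-status histogram for one upload run (hypothetical helper)."""
    counts = Counter(statuses)
    return {
        "total_attempted": len(statuses),
        # Assumption: duplicate/superseded tiles are not folded into total_queued.
        "total_queued": counts["queued"],
        "total_rejected": counts["rejected"],
        "retry_count": retry_count,
    }


payload = batch_complete_payload(["queued", "queued", "rejected", "duplicate"], 1)
```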
Phase 2 — Behaviour vs. spec
| Spec requirement | Status |
|---|---|
| Gate first; zero side effects on failure | PASS |
| `start_session` after gate, `end_session` in `finally` | PASS |
| `mark_uploaded` only on queued / duplicate / superseded | PASS |
| `record_signature_rejection` when `rejection_reason` mentions "signature" | PASS |
| Multipart via httpx's `files=`, no manual boundary | PASS |
| Canonical bytes order frozen; SHA-256 over deterministic concatenation | PASS |
| 429 honours `Retry-After` (int seconds + HTTP-date), capped via config | PASS |
| 5xx exponential backoff (1s/2s/4s/8s) → `SatelliteProviderError` after 4 | PASS |
| 401/403 fail-fast → `SatelliteProviderError` | PASS |
| `outcome = success \| partial`; failure paths raise (do NOT return) | see F1 |
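The two `Retry-After` forms named in the table (integer seconds and HTTP-date) can be handled with the standard library alone. The sketch below is one way to do it, not the uploader's actual code: `parse_retry_after`, its parameters, and the fall-back-to-cap behaviour for unparseable values are all assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str, now: datetime, cap_s: float) -> float:
    """Return a delay in seconds from a Retry-After value, clamped to [0, cap_s]."""
    text = value.strip()
    if text.isascii() and text.isdigit():
        # Delay-seconds form, e.g. "120".
        delay = float(text)
    else:
        try:
            # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            when = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            # Assumption: an unparseable header waits for the configured cap.
            return cap_s
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        delay = (when - now).total_seconds()
    return max(0.0, min(delay, cap_s))
```

Passing `now` in explicitly keeps the function deterministic under test; a date in the past clamps to zero rather than producing a negative sleep.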
Findings
F1 — Low (Spec wording vs. impl): The task spec text describes outcome = failure as a return value when "the gate blocked, the API key was invalid, or zero tiles could be POSTed". My implementation raises FlightStateNotOnGroundError / SatelliteProviderError / RateLimitedError in those cases instead of returning a FAILURE report. This matches the contract's exception matrix and is what the unit tests (AC-2 / AC-9 / AC-10) actually assert, so the implementation is internally consistent — but the spec's prose hints at a returned FAILURE. The UploadOutcome.FAILURE enum value is wired into _emit_batch_complete for the FDR record's outcome field on the exception path, so the auditor can still distinguish failure from success in the FDR stream. Action: documented here; no code change.
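The raise-but-still-record pattern in F1 can be sketched as below. The enum string values, the exception stand-in, and the `upload_batch` function with its two callable parameters are hypothetical; the review only confirms that a FAILURE outcome is written to the FDR record on the exception path while the exception itself propagates.

```python
from enum import Enum
from typing import Callable


class UploadOutcome(str, Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


class SatelliteProviderError(Exception):
    """Stand-in for the project's exception type."""


def upload_batch(
    post: Callable[[], int],
    emit_batch_complete: Callable[[UploadOutcome], None],
) -> UploadOutcome:
    """Run one upload; `post` returns the number of rejected tiles or raises."""
    try:
        rejected = post()
    except Exception:
        # Record FAILURE for the auditor, then let the caller see the exception.
        emit_batch_complete(UploadOutcome.FAILURE)
        raise
    outcome = UploadOutcome.PARTIAL if rejected else UploadOutcome.SUCCESS
    emit_batch_complete(outcome)
    return outcome
```

With this shape, callers never receive a FAILURE return value, yet the FDR stream still contains one summary record per run.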
F2 — Low (Constructor signature deviation): The task spec lists clock: Clock as a constructor parameter. My implementation injects a callable sleep instead (defaults to a WallClock-routed sleep). Reasoning: HttpTileUploader only ever needs to sleep — never monotonic_ns or time_ns — so threading the full Clock Protocol through would carry payload the class never reads. The default-sleep helper still routes through WallClock.sleep_until_ns, so the AZ-398 invariant (no direct time.sleep in components/) holds. Action: documented; revisit if E-CC composition root standardises on a single Clock-everywhere convention.
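Injecting a bare `sleep` callable, as F2 describes, makes the backoff path testable without real waiting. The following is a simplified sketch: `BackoffRunner` and its `run` method are hypothetical, and unlike the real uploader it returns `False` on exhaustion instead of raising.

```python
from typing import Callable, Sequence


class BackoffRunner:
    """Retries an attempt, sleeping between failures via an injected callable."""

    def __init__(
        self,
        sleep: Callable[[float], None],
        delays: Sequence[float] = (1.0, 2.0, 4.0, 8.0),
    ) -> None:
        self._sleep = sleep
        self._delays = tuple(delays)

    def run(self, attempt: Callable[[], bool]) -> bool:
        """Return True on the first success, False once the delays are used up."""
        for delay in self._delays:
            if attempt():
                return True
            self._sleep(delay)
        return False


# In a test, the "sleep" simply records the requested delays.
slept: list[float] = []
results = iter([False, False, True])
runner = BackoffRunner(sleep=slept.append)
succeeded = runner.run(lambda: next(results))
```

The class never touches a clock itself, so production wiring can pass a `WallClock`-routed sleep while tests pass a recorder.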
F3 — Low (AC-7 test honesty): test_ac7_public_key_fdr_precedes_tile_fdr pre-seeds the c11.upload.session.key.public record into the FakeFdrSink because the test uses a stub key manager (not the real AZ-318 PerFlightKeyManager). In production wiring, both producers share the same FDR client and ordering is naturally guaranteed by the call sequence. The test docstring calls this out explicitly. Action: documented; the integration test in E-BBT will exercise the real AZ-318 manager.
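The pre-seeding approach in F3 amounts to an ordering assertion over a fake sink. A minimal sketch follows; this `FakeFdrSink` is a stand-in written for the example rather than the project's test double, and the full kind name `c11.upload.tile.queued` is an assumption extrapolated from the other kind names in this review.

```python
class FakeFdrSink:
    """Records emitted FDR entries in order (illustrative stand-in)."""

    def __init__(self) -> None:
        self.records: list[tuple[str, dict]] = []

    def emit(self, kind: str, payload: dict) -> None:
        self.records.append((kind, payload))

    def first_index(self, kind: str) -> int:
        for index, (seen_kind, _) in enumerate(self.records):
            if seen_kind == kind:
                return index
        raise LookupError(f"no FDR record of kind {kind!r}")


PUBLIC_KEY_KIND = "c11.upload.session.key.public"
TILE_QUEUED_KIND = "c11.upload.tile.queued"  # assumed full kind name

sink = FakeFdrSink()
# Pre-seed: stands in for what the real key manager would emit first.
sink.emit(PUBLIC_KEY_KIND, {"flight_id": "f-1"})
sink.emit(TILE_QUEUED_KIND, {"flight_id": "f-1", "tile_id": "t-1"})
key_first = sink.first_index(PUBLIC_KEY_KIND) < sink.first_index(TILE_QUEUED_KIND)
```

A check like this only proves the uploader does not emit tile records before the seeded key record; it says nothing about whether the real key manager emits in time, which is why the integration test remains necessary.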
F4 — Low (Race window on partial-success batches): The uploader marks a tile uploaded immediately after the parent suite acknowledges it inside the per-batch loop. If the safety officer disputes the same tile within the audit window (≤ 1s), the C6 row is already uploaded. Spec Risk-5 documents this and defers mitigation to a separate audit task. Action: no code change in this batch.
Phase 3 — Tests
15 unit tests pass for HttpTileUploader; 2 for the Protocol conformance check; 3 fixture additions for the AZ-272 schema test. Full unit suite: 1404 passed, 80 skipped, 0 failed (skips are environment-gated: Docker, CUDA, TensorRT, Tier-2 hardware).
NFR-perf-throughput: 1000 tiles run under 5 s with the in-process MockTransport — well above the 20 tile/s contract floor (the mock removes the network bottleneck, so this verifies uploader bookkeeping has no O(n²) regression rather than certifying real throughput).
Phase 4 — Quality gates
- `ReadLints` clean across `c11_tile_manager/`, `runtime_root/c11_factory.py`, `fdr_client/records.py`, and the new test files
- No `time.sleep` in components (routes via `WallClock.sleep_until_ns`)
- No secrets in logs (AC-10 test asserts no `BEGIN PUBLIC KEY` / `Authorization` substring in any captured log record)
- No new third-party dependencies (uses existing `httpx` and `cryptography` pins)
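A no-secrets-in-logs gate of the kind listed above can be approximated with a capturing handler. This sketch is not the AC-10 test itself: the `ListHandler` class, the `leaked_markers` helper, the logger name, and the sample log line are hypothetical; the two forbidden substrings are the ones named in the list.

```python
import logging

FORBIDDEN = ("BEGIN PUBLIC KEY", "Authorization")


class ListHandler(logging.Handler):
    """Collects fully formatted log messages for later inspection."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def leaked_markers(messages: list[str]) -> list[str]:
    """Return every forbidden marker that appears in any captured message."""
    return [m for m in FORBIDDEN if any(m in msg for msg in messages)]


logger = logging.getLogger("c11_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

logger.info("upload.batch.complete batch_uuid=%s total_queued=%d", "b-1", 3)
leaks = leaked_markers(handler.messages)
```

Checking `record.getMessage()` rather than `record.msg` matters here, since a secret could enter through a format argument instead of the template string.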
Verdict
PASS_WITH_WARNINGS. All four findings are Low severity: drift between spec text and implementation (F1, F2), a test-double honesty caveat (F3), and the documented Risk-5 race window (F4). No code change required for batch close-out.