Closes out greenfield Step 6 (Decompose) for all 14 components (C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446 plus the _dependencies_table.md and component contract documents. State file updated to greenfield Step 7 (Implement), not_started. Co-authored-by: Cursor <cursoragent@cursor.com>
C11 TileUploader — Read Pending + Sign + POST + Mark Uploaded
Task: AZ-319_c11_tile_uploader
Name: C11 TileUploader
Description: Implement the TileUploader Protocol — C11's operator-side post-landing upload path. upload_pending_tiles calls AZ-317's FlightStateGate.confirm_on_ground() first, starts an AZ-318 signing session, reads pending mid-flight tiles from C6 (source = onboard_ingest, voting_status = pending) via the AZ-303 metadata store, packages each tile per the D-PROJ-2 multipart contract sketch (tile_blob, geo metadata, capture_timestamp, flight_id, companion_id, quality_metadata, signature), signs each payload, POSTs to /api/satellite/tiles/ingest, parses the per-tile response, and marks acknowledged tiles uploaded in C6. Honours Retry-After on 429s; fails fast on TLS / auth; surfaces signature_rejected per tile via FDR. The signing key is zeroised in a try/finally guarantee. Idempotent-retry across partial-success batches is a separate decorator task in this epic.
Complexity: 5 points
Dependencies: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-273_fdr_client_ringbuf, AZ-303_c6_storage_interfaces, AZ-305_c6_postgres_filesystem_store, AZ-317_c11_flight_state_gate, AZ-318_c11_signing_key
Component: c11_tilemanager (epic AZ-251 / E-C11)
Tracker: AZ-319
Epic: AZ-251 (E-C11)
Document Dependencies
- _docs/02_document/contracts/c11_tilemanager/tile_uploader.md: produced by this task (frozen Protocol + DTO shape, invariants, test cases).
- _docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md: consumed (pending_uploads, mark_uploaded, get_by_id).
- _docs/02_document/contracts/c6_tile_cache/tile_store.md: consumed (read_tile_pixels for the multipart blob).
- _docs/02_document/contracts/shared_logging/log_record_schema.md: INFO/WARN/ERROR log shapes for upload events.
- _docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md: kind="c11.upload.tile.queued" / kind="c11.upload.tile.rejected" / kind="c11.upload.batch.complete" envelopes.
- _docs/02_document/components/12_c11_tilemanager/description.md: § 3.2 D-PROJ-2 contract sketch, § 5 error handling.
- _docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md: D-PROJ-2 design task #1 ingest endpoint shape.
Problem
Without a real TileUploader:
- AC-8.4 (post-landing upload of mid-flight tiles to the parent suite) collapses — the pending-upload journal grows unboundedly across flights.
- D-PROJ-2's safety-officer correlation cannot work — the public-key + tile-id linkage exists only at upload time.
- The AC-NEW-7 voting / trust layer (parent-suite side) has no inputs — without uploads, no flights ever vote.
- Mid-flight tile generation (E-C13 mid-flight tile snapshot, AZ-294) becomes a leaf system: tiles land in C6 with voting_status = pending and stay there forever.
- SignatureRejectedError from the parent suite has no detection path; a key compromise would not surface to the safety officer until manual log inspection.
- Operators have no observable post-landing operation; the F10 functional flow has no implementation.
This task delivers the production uploader. It composes AZ-317 (gate) + AZ-318 (signing) + AZ-303/305 (C6) + httpx; it adds no new responsibilities beyond orchestration, so the surface area is tight.
Outcome
- A TileUploader Protocol + concrete HttpTileUploader class at src/gps_denied_onboard/components/c11_tilemanager/:
  - interface.py exposes TileUploader Protocol (runtime_checkable).
  - tile_uploader.py houses HttpTileUploader.
  - _types.py adds UploadRequest, UploadBatchReport, PerTileStatus, UploadOutcome (StrEnum), IngestStatus (StrEnum); all @dataclass(frozen=True) for the data DTOs.
  - errors.py adds SignatureRejectedError (subclasses TileManagerError); FlightStateNotOnGroundError and the rest are already declared in AZ-317/AZ-318/AZ-316.
- Constructor signature: __init__(self, *, http_client: httpx.Client, tile_store: TileStore, tile_metadata_store: TileMetadataStore, flight_state_gate: FlightStateGate, key_manager: PerFlightKeyManager, fdr_client: FdrClient, logger: Logger, clock: Clock, config: C11Config). Injected dependencies; no module-level singletons.
- upload_pending_tiles(request) flow:
  - Calls flight_state_gate.confirm_on_ground() (raises if not ON_GROUND; ZERO state-mutation prior to this).
  - Calls key_manager.start_session(flight_id_for_session); flight_id_for_session is request.flight_id if provided else uuid.uuid4() ("session id" for the multi-flight case).
  - In a try block:
    - Calls tile_metadata_store.pending_uploads(flight_id=request.flight_id) to enumerate pending tiles.
    - If empty → returns UploadBatchReport(outcome=success, per_tile_status=(), batch_uuid=uuid4()).
    - Splits the pending list into batches of request.batch_size.
    - For each batch:
      - Reads each tile's pixel bytes via tile_store.read_tile_pixels(tile_id).
      - Builds the multipart payload per tile: tile_blob, zoomLevel, latitude, longitude, tile_size_meters, tile_size_pixels, capture_timestamp, flight_id, companion_id, quality_metadata (JSON), signature (key_manager.sign(canonical_payload_bytes)).
      - Canonical payload bytes for signing: SHA-256 of tile_blob || zoomLevel || latitude || longitude || capture_timestamp || flight_id || companion_id || quality_metadata_json (deterministic byte concatenation; documented).
      - POSTs the multipart to {config.satellite_provider_url}/api/satellite/tiles/ingest.
      - On 202: parses batch_uuid + per_tile_status[] from the response body. For each queued | duplicate | superseded tile, calls tile_metadata_store.mark_uploaded(tile_id, batch_uuid). For each rejected tile, calls key_manager.record_signature_rejection(flight_id, tile_id) if the rejection reason mentions signature; emits FDR kind="c11.upload.tile.rejected" with the reason regardless.
      - On 429: honours Retry-After; on persistent 429 → RateLimitedError.
      - On 5xx: exponential backoff (1s, 2s, 4s; 4 retries max); persistent → SatelliteProviderError.
      - On TLS / 401 / 403: fail fast → SatelliteProviderError.
    - Aggregates UploadBatchReport:
      - outcome = success if ALL tiles are queued | duplicate | superseded.
      - outcome = partial if any rejected OR any unparseable response with otherwise-acked tiles.
      - outcome = failure if the gate blocked, the API key was invalid, or zero tiles could be POSTed.
      - public_key_fingerprint = the AZ-318 fingerprint from start_session.
      - batch_uuid = the LAST successful batch's UUID (or uuid4() if none succeeded; documented).
  - In a finally block:
    - Calls key_manager.end_session(): guaranteed zeroisation regardless of success / failure / exception.
    - Emits FDR kind="c11.upload.batch.complete" with {flight_id_for_session, public_key_fingerprint, total_attempted, total_queued, total_rejected, outcome, observed_at_iso}.
- enumerate_pending_tiles(flight_id) returns tile_metadata_store.pending_uploads(flight_id) directly (read-only enumeration).
- confirm_flight_state() returns flight_state_gate.confirm_on_ground() (passes through; raises on non-ON_GROUND).
- INFO log on session start/end with batch counts; WARN log per retry; ERROR log on SatelliteProviderError, FlightStateNotOnGroundError (caught and re-raised after log).
- Composition root constructs HttpTileUploader via build_tile_uploader(config) -> TileUploader at src/gps_denied_onboard/runtime_root/c11_factory.py.
- Configuration extension to AZ-269 loader: config.c11.satellite_provider_ingest_url, config.c11.upload_batch_size, config.c11.upload_http_timeout_s, config.c11.companion_id.
- Type-only conformance test verifies isinstance(HttpTileUploader(...), TileUploader).
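The ordering of upload_pending_tiles (gate, then session start, then a try body, then an unconditional end_session) can be sketched as below. This is a minimal illustration, not the contracted implementation: FakeGate, FakeKeyManager, the post_tile callable, the dict-shaped report and the placeholder fingerprint are all hypothetical stand-ins for the AZ-317 / AZ-318 / httpx collaborators, and batching, FDR emission and mark_uploaded are omitted.

```python
import uuid
from dataclasses import dataclass, field


class FlightStateNotOnGroundError(Exception):
    """Stand-in for the error declared by AZ-317 (hypothetical local copy)."""


@dataclass
class FakeGate:
    on_ground: bool = True

    def confirm_on_ground(self) -> None:
        if not self.on_ground:
            raise FlightStateNotOnGroundError("IN_FLIGHT")


@dataclass
class FakeKeyManager:
    calls: list = field(default_factory=list)

    def start_session(self, session_id) -> str:
        self.calls.append("start_session")
        return "fingerprint-placeholder"  # hypothetical return value

    def end_session(self) -> None:
        self.calls.append("end_session")


def upload_pending_tiles(gate, key_manager, pending_tile_ids, post_tile, flight_id=None):
    # Step 1: the gate runs first; if it raises, nothing below is touched.
    gate.confirm_on_ground()
    # Step 2: the signing session starts only after the gate has passed.
    session_id = flight_id if flight_id is not None else uuid.uuid4()
    fingerprint = key_manager.start_session(session_id)
    try:
        # Step 3: enumerate and POST (one call per tile in this sketch).
        statuses = tuple((tile_id, post_tile(tile_id)) for tile_id in pending_tile_ids)
        return {"public_key_fingerprint": fingerprint, "per_tile_status": statuses}
    finally:
        # Step 4: runs on success, on any exception, and on KeyboardInterrupt.
        key_manager.end_session()
```

Because the gate call sits outside the try block, a blocked gate leaves the key manager untouched (no start_session, no end_session), which is the behaviour AC-2 asks for.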
Scope
Included
- TileUploader Protocol declaration + HttpTileUploader concrete class.
- UploadRequest, UploadBatchReport, PerTileStatus, UploadOutcome, IngestStatus DTOs.
- SignatureRejectedError definition (subclass of TileManagerError).
- The orchestration: gate → start_session → enumerate → batch loop → mark_uploaded / FDR alert → end_session.
- Multipart payload construction + canonical bytes for signing.
- HTTP retry / backoff / Retry-After handling for the upload path.
- Composition-root factory build_tile_uploader.
- Config schema extension for the C11 upload fields.
- Conformance test at tests/unit/c11_tilemanager/test_protocol_conformance.py.
Excluded
- The TileDownloader Protocol and concrete impl: separate task (AZ-316).
- FlightStateGate impl: owned by AZ-317.
- PerFlightKeyManager impl: owned by AZ-318.
- Idempotent-retry-on-partial-success batch decorator: separate task in this epic (AZ-320_c11_idempotent_retry).
- The R02 ADR-004 build-time exclusion: owned by E-BOOT.
- The pre-flight key enrolment workflow at C12: owned by E-C12.
- The mock-suite-sat-service fixture under tests/fixtures/: owned by E-BBT (test infrastructure).
- Voting / trust promotion: owned by D-PROJ-2 / satellite-provider.
- E-C8's FlightStateSource impl: owned by E-C8 (AZ-261).
Acceptance Criteria
AC-1: Happy path uploads all pending tiles
Given 50 pending tiles in C6, ON_GROUND, parent suite returns 202 with all queued
When upload_pending_tiles(request) is called
Then 50 POSTs issued (one per tile or batched per batch_size); all 50 marked uploaded in C6 (verifiable via mark_uploaded spy); UploadBatchReport.outcome = success; one FDR kind="c11.upload.batch.complete" with total_attempted=50, total_queued=50
AC-2: Flight-state gate blocks before any read or POST
Given FlightStateGate.confirm_on_ground() raises FlightStateNotOnGroundError(IN_FLIGHT)
When upload_pending_tiles(request) is called
Then FlightStateNotOnGroundError is raised; ZERO calls to pending_uploads (verifiable via spy); ZERO HTTP POSTs; ZERO calls to key_manager.start_session (key generation is also gated); key_manager.end_session() is NOT called (no session was started)
AC-3: Signature rejection per tile is FDR'd and not marked uploaded
Given parent suite returns rejected for 1 tile with reason "invalid signature"
When the response is parsed
Then key_manager.record_signature_rejection(flight_id, tile_id) is called once; tile_metadata_store.mark_uploaded is NOT called for that tile; the tile remains voting_status = pending; FDR kind="c11.upload.tile.rejected" is emitted with the reason; report's outcome = partial
AC-4: duplicate and superseded are treated as success
Given parent suite returns duplicate for 5 tiles and superseded for 3 tiles
When the response is parsed
Then all 8 are mark_uploaded'd in C6 with the batch_uuid; report's per_tile_status reflects the original status; outcome = success if no rejected
AC-5: Signing key is zeroised on success
Given a successful upload
When upload_pending_tiles returns
Then key_manager.end_session() was called once (verifiable via spy); the AZ-318 manager's _private_key is None
AC-6: Signing key is zeroised on failure
Given the FIRST POST raises a connection-reset error
When upload_pending_tiles raises SatelliteProviderError
Then key_manager.end_session() was called (try/finally executed); the manager's _private_key is None; the partial state in C6 is consistent (no half-marked tiles)
AC-7: Public-key FDR record precedes any tile FDR
Given a session with at least one tile
When the FDR stream is captured
Then kind="c11.upload.session.key.public" is observed BEFORE any kind="c11.upload.tile.*" record
AC-8: 429 honours Retry-After
Given parent suite returns 429 with Retry-After: 60 on the first POST
When the uploader processes the response
Then Clock.sleep is called with ≥ 60s; on success the run proceeds; the report includes retry_count >= 1
AC-9: Persistent 5xx aborts with structured error
Given parent suite returns 503 for 5 consecutive attempts
When the uploader exhausts retries
Then SatelliteProviderError is raised; the report is NOT returned (the exception propagates); key_manager.end_session() was called via finally
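The retry behaviour exercised by AC-9 can be sketched as a small loop. This is an illustration under stated assumptions: the spec lists delays of 1s, 2s, 4s with 4 retries max, so the fourth delay of 8s (continued doubling) is an assumption here, and five consecutive 503s are read as one initial attempt plus four retries. The send and sleep callables are hypothetical seams standing in for the httpx POST and Clock.sleep; the local SatelliteProviderError is a stand-in for the project's error class.

```python
class SatelliteProviderError(Exception):
    """Stand-in for the project's error class (hypothetical local copy)."""


def post_with_backoff(send, sleep, max_retries=4, base_delay_s=1.0):
    """Call send() until it returns a non-5xx status or retries run out.

    send  -- zero-argument callable returning an HTTP status code (assumption)
    sleep -- callable taking seconds, e.g. an injected Clock.sleep (assumption)
    """
    retries_used = 0
    while True:
        status = send()
        if status < 500:
            return status
        if retries_used >= max_retries:
            raise SatelliteProviderError(
                f"persistent {status} after {retries_used + 1} attempts"
            )
        # Delay doubles on each retry: 1s, 2s, 4s, 8s with the defaults.
        sleep(base_delay_s * (2 ** retries_used))
        retries_used += 1
```

Injecting sleep keeps the unit test instantaneous: the test records the requested delays instead of waiting for them.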
AC-10: TLS / 401 / 403 fail fast
Given the first POST returns 401
When the uploader processes the response
Then SatelliteProviderError is raised on the first attempt; zero retries; the public key is NOT logged; the API key (if any is sent in an auth header) is NOT logged
AC-11: Empty pending set is success with no POSTs
Given zero pending tiles in C6
When upload_pending_tiles(request) is called
Then outcome = success; per_tile_status is empty; key_manager.start_session was called (signature still required by D-PROJ-2 for the empty-batch ack record per § 3.2; documented); end_session was called; ONE FDR c11.upload.batch.complete with total_attempted=0
AC-12: Conformance — concrete impl satisfies Protocol
Given an HttpTileUploader instance
When isinstance(impl, TileUploader) is checked under runtime_checkable
Then the result is True; a fake omitting confirm_flight_state returns False
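A minimal sketch of the AC-12 check follows. The three method names come from this spec; the parameter lists, the return values and the two fake classes are illustrative assumptions, and the frozen Protocol in the contract file remains the source of truth. Note that isinstance against a runtime_checkable Protocol only checks that the methods exist, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TileUploader(Protocol):
    # Method names from the spec; signatures here are simplified assumptions.
    def upload_pending_tiles(self, request): ...
    def enumerate_pending_tiles(self, flight_id): ...
    def confirm_flight_state(self): ...


class CompleteFake:
    """Hypothetical fake implementing all three methods."""

    def upload_pending_tiles(self, request):
        return None

    def enumerate_pending_tiles(self, flight_id):
        return ()

    def confirm_flight_state(self):
        return None


class PartialFake:
    """Hypothetical fake that omits confirm_flight_state."""

    def upload_pending_tiles(self, request):
        return None

    def enumerate_pending_tiles(self, flight_id):
        return ()


complete_ok = isinstance(CompleteFake(), TileUploader)
partial_ok = isinstance(PartialFake(), TileUploader)
```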
AC-13: Canonical signing bytes are deterministic
Given the same tile metadata + tile bytes
When _canonical_payload_bytes(tile) is computed twice
Then the two byte strings are bitwise identical (no map ordering, no JSON whitespace drift); the SHA-256 over them matches; this is asserted via property test with N random tiles
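One way to obtain the determinism AC-13 asks for is sketched below. The field order follows the scheme named in this spec; everything about the per-field encoding is an assumption, because the spec does not fix it: big-endian struct packing for the numbers, UTF-8 for strings, sorted-key compact JSON for quality_metadata, and a 4-byte length prefix per field so that adjacent fields cannot blur into each other. The parent suite would have to agree on whatever encoding is finally chosen.

```python
import hashlib
import json
import struct


def canonical_payload_bytes(tile_blob: bytes, zoom_level: int, latitude: float,
                            longitude: float, capture_timestamp: str,
                            flight_id: str, companion_id: str,
                            quality_metadata: dict) -> bytes:
    # Sorted keys and fixed separators remove dict-ordering and whitespace drift.
    quality_json = json.dumps(quality_metadata, sort_keys=True,
                              separators=(",", ":"), ensure_ascii=True)
    fields = [
        tile_blob,
        struct.pack(">i", zoom_level),   # assumption: 32-bit signed, big-endian
        struct.pack(">d", latitude),     # assumption: IEEE-754 double, big-endian
        struct.pack(">d", longitude),
        capture_timestamp.encode("utf-8"),
        flight_id.encode("utf-8"),
        companion_id.encode("utf-8"),
        quality_json.encode("utf-8"),
    ]
    digest = hashlib.sha256()
    for field_bytes in fields:
        # Assumption: length-prefix each field before concatenation.
        digest.update(struct.pack(">I", len(field_bytes)))
        digest.update(field_bytes)
    return digest.digest()
```

Packing floats as binary doubles avoids the float-formatting drift that a textual representation could introduce between two implementations.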
AC-14: Partial-success batches return without raising
Given a 10-tile batch where 7 are queued, 3 are rejected
When upload_pending_tiles returns
Then NO exception is raised; outcome = partial; per_tile_status has all 10 entries with their respective statuses; the 7 acked tiles are marked uploaded in C6; the 3 rejected stay pending
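The per-tile bookkeeping behind AC-3, AC-4 and AC-14 can be sketched as a pure function. This is a simplified illustration: per_tile_status is assumed to be a sequence of (tile_id, status) pairs, the function name is hypothetical, and the failure outcome (gate blocked, invalid API key, zero tiles POSTed) is decided elsewhere and is not modelled. The spec names StrEnum; str plus Enum is used here only so the sketch also runs on Python versions before 3.11.

```python
from enum import Enum


class UploadOutcome(str, Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"  # not produced by this sketch


# Statuses the spec treats as acknowledged by the parent suite.
ACKED_STATUSES = frozenset({"queued", "duplicate", "superseded"})


def summarise_batch(per_tile_status):
    """Split tiles into those to mark uploaded and those left pending."""
    to_mark = [tile_id for tile_id, status in per_tile_status
               if status in ACKED_STATUSES]
    still_pending = [tile_id for tile_id, status in per_tile_status
                     if status not in ACKED_STATUSES]
    # Any non-acked tile downgrades the batch; an empty batch counts as success.
    outcome = UploadOutcome.PARTIAL if still_pending else UploadOutcome.SUCCESS
    return outcome, to_mark, still_pending
```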
Non-Functional Requirements
Performance
- Upload throughput ≥ 20 tiles/s with signing (C11-PT-02); the bottleneck is the network plus signing per tile.
- Per-tile signing ≤ 200 µs (Ed25519 from AZ-318); per-tile multipart construction ≤ 1 ms.
Compatibility
- httpx per project pin; cryptography per project pin.
- Multipart form encoding per httpx's files= parameter; no manual boundary construction.
Reliability
- Try/finally ensures key_manager.end_session() runs in EVERY exit path including unexpected exceptions and KeyboardInterrupt.
- The uploader writes to C6 ONLY via the AZ-303 Protocol (mark_uploaded); it does NOT touch the metadata table directly.
- Concurrent invocations against the same cache_root are gated by C12's filesystem lockfile (same lock as TileDownloader); the uploader asserts the lock at construction.
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | 50-tile happy path | All mark_uploaded'd; outcome=success; FDR batch.complete present |
| AC-2 | Gate raises before any work | Zero spies fire on pending_uploads, POST, start_session |
| AC-3 | One signature rejection in a 5-tile batch | record_signature_rejection called once; rejected tile NOT marked uploaded; outcome=partial |
| AC-4 | Mix of duplicate and superseded responses | All marked uploaded; outcome=success |
| AC-5 | Successful upload | end_session called; _private_key is None |
| AC-6 | Mid-batch failure | end_session called; key zeroised |
| AC-7 | FDR stream order | key.public before any tile.* |
| AC-8 | 429 + Retry-After: 60 | Clock.sleep ≥ 60s; retry succeeds |
| AC-9 | 5x 503 | SatelliteProviderError; finally still ran |
| AC-10 | 401 first attempt | Fail-fast; no API-key in any log |
| AC-11 | Empty pending set | outcome=success; zero POSTs; key still session-started/ended |
| AC-12 | isinstance check on impl + partial fake | True / False |
| AC-13 | Property test: deterministic canonical bytes | Bitwise equal for N samples |
| AC-14 | Partial-success batch | No exception; outcome=partial; per-tile statuses correct |
| NFR-perf-throughput | 1000 tiles via fake httpx | ≥ 20 tiles/s including signing |
Constraints
- The signing canonical-bytes scheme is sha256(tile_blob || zoomLevel || latitude || longitude || capture_timestamp || flight_id || companion_id || quality_metadata_json); the parent suite's D-PROJ-2 ingest endpoint MUST agree on this scheme (the leftover file documents the contract sketch). Any divergence at the parent-suite side surfaces as signature_rejected and gets FDR-alerted.
- The uploader does NOT modify the multipart payload's tile_blob; bytes go from C6 directly into the POST body.
- The order of operations is gate → start_session → enumerate → batch loop → finally end_session. Reordering is a Reliability finding (High).
- Concurrent C11 invocations are blocked by C12's lockfile; this task asserts the lock exists at construction.
- This task introduces no new third-party dependencies beyond httpx and cryptography (already used in AZ-316 and AZ-318).
- The companion_id field comes from config.c11.companion_id (not auto-detected, not derived from hostname); documented because the parent suite's voting layer relies on stable per-companion identifiers.
Risks & Mitigation
Risk 1: Parent-suite ingest endpoint not yet implemented (D-PROJ-2)
- Risk: Until satellite-provider ships the POST endpoint, every upload fails with 404.
- Mitigation: The e2e-test mock-suite-sat-service fixture (under tests/fixtures/, owned by E-BBT) implements the planned POST contract. The C11 unit tests run against a fake httpx.Client; integration tests run against the mock fixture. Production switches to the real endpoint when it ships; no code change in C11.
Risk 2: Signature canonical-bytes drift between C11 and parent suite
- Risk: A subtle JSON-ordering or float-formatting drift produces signatures that don't verify on the parent side.
- Mitigation: AC-13 property test asserts bitwise determinism on the C11 side; the leftover file documents the canonical scheme; the parent-suite team's Plan cycle will reuse the same scheme. If they diverge, signature_rejected surfaces immediately and the safety officer is alerted.
Risk 3: Retry-After parsing for HTTP-date format
- Risk: The parent suite returns Retry-After: <date> not <seconds>; naïve parsing crashes.
- Mitigation: Same as AZ-316 (TileDownloader Risk 1): parse both forms; cap wait at config.c11.max_retry_after_s.
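Parsing both Retry-After forms with a cap might look like the sketch below, using only the standard library. The function name and parameters are hypothetical; max_wait_s stands in for config.c11.max_retry_after_s and now for a value read from the injected Clock. Falling back to the cap when the header cannot be parsed at all is an assumption, not something the spec states.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str, now: datetime, max_wait_s: float) -> float:
    """Return seconds to wait, clamped to the range [0, max_wait_s]."""
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "Retry-After: 60"
        wait = float(value)
    else:
        # HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
        try:
            target = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return max_wait_s  # assumption: unparseable header waits for the cap
        if target.tzinfo is None:
            target = target.replace(tzinfo=timezone.utc)
        wait = (target - now).total_seconds()
    return min(max(wait, 0.0), max_wait_s)
```

Clamping at zero covers an HTTP-date that is already in the past, and clamping at the cap keeps a hostile or misconfigured server from stalling the upload indefinitely.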
Risk 4: try/finally violation (key not zeroised on KeyboardInterrupt)
- Risk: A KeyboardInterrupt during the batch loop bypasses the finally if poorly written.
- Mitigation: The finally is unconditional (Python's try/finally runs for KeyboardInterrupt); a unit test injects KeyboardInterrupt mid-batch and asserts end_session ran.
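The unit test described in the mitigation could take roughly this shape. SpyKeyManager, run_batches and interrupting_post are hypothetical names used only for illustration; the point is that KeyboardInterrupt derives from BaseException, so an `except Exception` handler would miss it while a finally clause still runs.

```python
class SpyKeyManager:
    """Hypothetical spy recording whether end_session was called."""

    def __init__(self):
        self.ended = False

    def end_session(self):
        self.ended = True


def run_batches(key_manager, batches, post):
    try:
        for batch in batches:
            post(batch)
    finally:
        # Executes even when post() raises a BaseException subclass.
        key_manager.end_session()


def interrupting_post(batch):
    # Simulates the operator pressing Ctrl-C in the middle of a batch.
    raise KeyboardInterrupt


spy = SpyKeyManager()
try:
    run_batches(spy, [["tile-1"], ["tile-2"]], interrupting_post)
except KeyboardInterrupt:
    interrupted = True
else:
    interrupted = False
```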
Risk 5: Partial-success state inconsistency
- Risk: A tile is marked uploaded in C6 but the parent suite later disputes (race between mark_uploaded and the safety officer's audit).
- Mitigation: mark_uploaded records the batch_uuid (per AZ-303 contract); audits cross-reference batch_uuid + tile_id against the parent suite's ingest log. The race window is ≤ 1 sec (mark happens immediately after the per-tile response is parsed). Documented; not addressed in this task.
Runtime Completeness
- Named capability: post-landing tile upload to D-PROJ-2 ingest endpoint, AC-8.4 enforcement, F10 functional flow, R09 mitigation via per-flight key (composed from AZ-318), parent-suite voting-layer enabler.
- Production code that must exist: real HttpTileUploader orchestrating real httpx POSTs, real C6 mark_uploaded calls, real try/finally zeroisation, real composition-root factory, real config schema extension, real canonical-byte scheme.
- Allowed external stubs: tests MAY use a fake httpx.Client, fake Clock, fake C6 stores (already provided by AZ-303's conformance fakes), fake FlightStateGate and PerFlightKeyManager (so this task's tests don't drag in AZ-317/AZ-318 internals); production wiring uses real all the way down.
- Unacceptable substitutes: skipping the gate (defeats AC-8.4 defence-in-depth); silently retrying signature rejections without FDR (loses safety officer surface); reusing a static signing key (reintroduces R09); marking a tile uploaded before the parent suite acks (data integrity violation); manually building the multipart boundary (httpx's files= is the right interface).
Contract
This task produces/implements the contract at _docs/02_document/contracts/c11_tilemanager/tile_uploader.md.
Consumers MUST read that file — not this task spec — to discover the interface.