# C11 TileDownloader — GET + Resolution Gate + C6 Write **Task**: AZ-316_c11_tile_downloader **Name**: C11 TileDownloader **Description**: Implement the `TileDownloader` Protocol — C11's operator-side download path. `download_tiles_for_area` issues authenticated `httpx` GETs against `satellite-provider`, enforces RESTRICT-SAT-4 (≥ 0.5 m/px) at the C11 boundary, writes accepted tiles via the AZ-303 `TileStore` + `TileMetadataStore` Protocols (which run AZ-307's freshness gate at write), pre-checks cache headroom against AZ-308's budget enforcer, and journals download progress for idempotent re-runs. `enumerate_remote_coverage` is the read-only enumeration helper used by C12 for pre-flight area sizing. Honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Surfaces freshness, resolution, downgrade, and outcome counts in `DownloadBatchReport`. **Complexity**: 5 points **Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-303_c6_storage_interfaces, AZ-305_c6_postgres_filesystem_store, AZ-307_c6_freshness_gate, AZ-308_c6_cache_budget_eviction **Component**: c11_tilemanager (epic AZ-251 / E-C11) **Tracker**: AZ-316 **Epic**: AZ-251 (E-C11) ### Document Dependencies - `_docs/02_document/contracts/c11_tilemanager/tile_downloader.md` — produced by this task (frozen Protocol + DTO shape, invariants, test cases). - `_docs/02_document/contracts/c6_tile_cache/tile_store.md` — consumed: `write_tile`, `tile_exists` are the write-side endpoints used by this task. - `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md` — consumed: `insert_metadata`, `query_by_bbox` for idempotence. - `_docs/02_document/contracts/shared_logging/log_record_schema.md` — INFO/WARN/ERROR log shapes for download events. - `_docs/02_document/components/12_c11_tilemanager/description.md` — § 3.1 download API, § 5 error handling, § 7 caveats (R02 enforcement). ## Problem Without a real `TileDownloader`: - F1 pre-flight cache build cannot start (C10 has no source; C12 has no operation to invoke). - AC-8.1 (imagery via Suite Sat Service offline cache) and AC-8.3 (pre-loaded onto companion before flight) collapse — both require C11 populating C6 first. - RESTRICT-SAT-4 (≥ 0.5 m/px) has no enforcement point — sub-spec tiles would silently land in C6 and downstream pose estimation would degrade. - AC-NEW-6 (system rejects/downgrades stale tiles) is unenforced for the download path; C6's gate (AZ-307) catches it but the operator has no visibility into how many were rejected per batch. - A re-run of an already-completed download would re-issue every GET against `satellite-provider`, wasting bandwidth and fighting the operator workflow which expects idempotence (D-C10-1 manifest hash check downstream relies on stable downloads). - Network failures (TLS, auth, 5xx, 429) without a structured handler would either silently drop tiles or crash the operator-tooling CLI mid-batch. This task delivers the production downloader. It writes no C6 storage logic (AZ-303/305 own that), no freshness rule (AZ-307 owns), and no eviction (AZ-308 owns) — it composes them. ## Outcome - A `TileDownloader` Protocol + concrete `HttpTileDownloader` class at `src/gps_denied_onboard/components/c11_tilemanager/`: - `interface.py` exposes `TileDownloader` Protocol (`runtime_checkable`). - `tile_downloader.py` houses `HttpTileDownloader`. - `_types.py` houses `DownloadRequest`, `DownloadBatchReport`, `TileSummary`, `DownloadOutcome` (all `@dataclass(frozen=True)` for the data DTOs; `DownloadOutcome` is a `StrEnum`). - `errors.py` houses `SatelliteProviderError`, `RateLimitedError`, `ResolutionRejectionError`, `CacheBudgetExceededError`, `TileManagerError` (family parent). - Constructor signature: `__init__(self, *, http_client: httpx.Client, tile_store: TileStore, tile_metadata_store: TileMetadataStore, budget_enforcer: CacheBudgetEnforcer, logger: Logger, clock: Clock)`. Injected dependencies — no module-level singletons. - `download_tiles_for_area(request)` flow: 1. Validates the `DownloadRequest` (bbox ordering, zoom levels in `[0, 21]`, `cache_root` is writable). 2. Looks up `cache_root/.c11/journal/__.json`. If present and complete → `outcome = idempotent_no_op`; return early with zero counts. 3. Calls `enumerate_remote_coverage(bbox, zoom_levels)` to size the batch. 4. Calls `budget_enforcer.reserve_headroom(estimated_bytes)`. If insufficient → `CacheBudgetExceededError`; `outcome = failure`. 5. For each tile in the batch, fetches the `httpx` GET response (per-tile streaming to bound memory): - Auth: TLS + `Authorization: Bearer `. - On 429 → honour `Retry-After` (sleep then retry once; second 429 → `RateLimitedError`). - On 5xx → exponential backoff with cap (1s, 2s, 4s; 4 retries max); persistent 5xx → `SatelliteProviderError`. - On TLS / 401 / 403 → fail fast → `SatelliteProviderError`. 6. Resolution gate: read `resolution_m_per_px` from response headers / metadata; if `< 0.5` → increment `tiles_rejected_resolution`; do NOT write to C6. 7. Calls `tile_store.write_tile(...)` and `tile_metadata_store.insert_metadata(...)`. C6's freshness gate (AZ-307) may raise `FreshnessRejectionError` here — catch, increment `tiles_rejected_freshness`, continue. 8. On `FreshnessLabel.DOWNGRADED` (returned by C6 when stable_rear-stale) → increment `tiles_downgraded`. 9. After every successful tile write, append the tile id to the journal (`fsync` per atomicwrites pattern from description.md). 10. On batch completion: write the journal's terminal record, return `DownloadBatchReport(outcome=success)`. - `enumerate_remote_coverage(bbox, zoom_levels)` issues a `GET /api/satellite/tiles?bbox=...&zoom=...&list-only=true` (matches the existing parent-suite GET surface; the response carries the per-tile `produced_at` + `resolution_m_per_px` + `estimated_bytes`); returns `list[TileSummary]`. - INFO log: `kind="c11.download.session.start"` / `kind="c11.download.session.end"` with batch counts. - WARN log: per retry, per freshness rejection batch, per resolution rejection. - ERROR log: persistent `SatelliteProviderError`, `CacheBudgetExceededError`, TLS/auth failures. - The composition root constructs `HttpTileDownloader` via `build_tile_downloader(config) -> TileDownloader` at `src/gps_denied_onboard/runtime_root/c11_factory.py`. - Configuration extension to AZ-269 loader: `config.c11.satellite_provider_url`, `config.c11.service_api_key`, `config.c11.cache_root`, `config.c11.http_timeout_s`, `config.c11.max_5xx_retries`. - Type-only conformance test verifies `isinstance(HttpTileDownloader(...), TileDownloader)` via `runtime_checkable`. - The contract file at `_docs/02_document/contracts/c11_tilemanager/tile_downloader.md` is frozen as part of this task; consumers (C12) read it, not this spec. ## Scope ### Included - `TileDownloader` Protocol declaration + `HttpTileDownloader` concrete class. - All four DTOs (`DownloadRequest`, `DownloadBatchReport`, `TileSummary`) and the `DownloadOutcome` + `FreshnessLabel`-consumption enums. - The download-progress journal under `cache_root/.c11/journal/` (atomicwrites + fsync; one file per `(flight_id, request_hash)` pair). - The resolution gate at C11 boundary (RESTRICT-SAT-4). - The cache-headroom pre-check via AZ-308's `CacheBudgetEnforcer.reserve_headroom`. - HTTP retry / backoff / `Retry-After` handling. - Composition-root factory `build_tile_downloader`. - Config schema extension for the C11 download fields. - Conformance test at `tests/unit/c11_tilemanager/test_protocol_conformance.py`. ### Excluded - The `TileUploader` Protocol and concrete impl — separate task in this epic. - The per-flight signing key — separate task; downloads use a static service-internal API key, NOT the per-flight key. - The flight-state gate — download has no `flight_state` precondition (it runs operator-side, never airborne). The gate task applies to upload only. - Idempotent-retry-on-partial-success — that's the upload concern; download idempotence is handled here via the journal. - The `mock-suite-sat-service` fixture — the e2e-test fixture in `tests/fixtures/` is for the upload path. Download tests run against the REAL `satellite-provider` GET surface (or its existing test Docker fixture). - C12 CLI plumbing — owned by E-C12. - Sector boundary CRUD (C12) — the `DownloadRequest.sector_class` is supplied per request; C11 does NOT look up sector boundaries. - The R02 ADR-004 build-time exclusion of `c11_tilemanager` from the airborne image — owned by E-BOOT (build-system task). ## Acceptance Criteria **AC-1: Downloader writes accepted tiles to C6** Given a `DownloadRequest` for a Derkachi-bbox with 100 tiles all fresh and ≥ 0.5 m/px When `download_tiles_for_area` is called Then `DownloadBatchReport.tiles_downloaded == 100`; every tile is present in `TileStore` (verifiable via `tile_exists`); every tile has a `TileMetadata` row; `outcome = success` **AC-2: Resolution-gate rejects sub-spec tiles at C11 boundary** Given a batch of 50 tiles where 10 have `resolution_m_per_px = 0.3` When `download_tiles_for_area` is called Then `tiles_rejected_resolution == 10`; the 10 are NOT in `TileStore`; no `tile_store.write_tile` was attempted for them (verifiable via spy); ONE `c11.download.resolution_rejected` WARN log per rejected tile **AC-3: Freshness rejections from C6 are counted, not propagated** Given C6's AZ-307 raises `FreshnessRejectionError` for 5 tiles in an active_conflict batch When `download_tiles_for_area` is called Then `tiles_rejected_freshness == 5`; the run continues for the remaining tiles; no exception escapes to the caller; ONE summary WARN log at session end **AC-4: Stable_rear stale tiles are surfaced as downgraded** Given C6 returns `FreshnessLabel.DOWNGRADED` for 3 tiles (stable_rear stale) When `download_tiles_for_area` is called Then `tiles_downgraded == 3`; those tiles ARE present in `TileStore` with the `DOWNGRADED` label; no exception is raised **AC-5: 429 honours Retry-After** Given `satellite-provider` returns 429 with `Retry-After: 30` When the downloader hits the 429 Then the downloader sleeps ≥ 30s before the retry (verifiable via injected `Clock.sleep` capture); on success the run proceeds normally; no `RateLimitedError` is raised **AC-6: Persistent 5xx aborts with structured error** Given `satellite-provider` returns 503 for 5 consecutive attempts (exceeds the retry budget) When the downloader exhausts retries Then `SatelliteProviderError` is raised with the response body, the URL, and the attempt count; `outcome = failure`; partial writes are journaled (the run is resumable on next call) **AC-7: TLS / 401 / 403 fail fast (no retry)** Given the first GET returns 401 When the downloader processes the response Then `SatelliteProviderError` is raised on the FIRST attempt (zero retries); no plaintext fallback is attempted; the API key is NOT logged (security) **AC-8: Idempotent re-run after success** Given a successful prior `download_tiles_for_area(R)` whose journal recorded all 100 tiles When `download_tiles_for_area(R)` is called again Then `outcome = idempotent_no_op`; zero GETs observed; zero `tile_store.write_tile` calls; the report's counts are zero except `tiles_downloaded` which equals the journaled count (transparency for the caller) **AC-9: Cache-budget pre-check aborts before any write** Given `budget_enforcer.reserve_headroom(estimated_bytes)` returns `EvictionResult.insufficient` When `download_tiles_for_area` is called Then `CacheBudgetExceededError` is raised with the budget delta; zero GETs are issued; zero writes attempted; ONE ERROR log **AC-10: Conformance — concrete impl satisfies Protocol** Given a `HttpTileDownloader` instance When `isinstance(impl, TileDownloader)` is checked under `runtime_checkable` Then the result is `True`; a fake omitting `enumerate_remote_coverage` returns `False` **AC-11: Service API key never logged** Given any log path (INFO, WARN, ERROR) When the downloader logs the request URL or headers Then the `service_api_key` value is redacted (`Bearer ***`); no test log capture observes the raw key **AC-12: Journal survives mid-batch crash** Given the process is killed after 30 of 100 tiles have been written and journaled When `download_tiles_for_area(R)` is called on restart Then the journal is read; the 30 already-journaled tiles are skipped; only the remaining 70 are GET-fetched; final report shows `tiles_downloaded = 100` (30 prior + 70 new); `outcome = success` ## Non-Functional Requirements **Performance** - Download throughput ≥ 50 MB/s on a 1 Gbps link (C11-PT-01); the bottleneck is the network, not the writer. - Per-tile resolution-gate check ≤ 50 µs (header parse + comparison). - Journal append ≤ 5 ms per tile (atomicwrites + fsync; bounded by disk). **Compatibility** - `httpx` per project pin (verify before adding); no per-task version bump unless absolutely required. - `atomicwrites` for journal — already used elsewhere (per description.md § 5). **Reliability** - The downloader MUST tolerate process kill at any point and recover via journal on restart (AC-12). - The downloader MUST NOT corrupt C6 — the AZ-303 Protocol guarantees atomic writes; this task does not add new fs operations. - The injected `Clock` MUST be the same singleton used by AZ-308's enforcer and AZ-307's gate. ## Unit Tests | AC Ref | What to Test | Required Outcome | |--------|-------------|-----------------| | AC-1 | 100-tile happy path against fake `httpx.Client` | All in C6; `tiles_downloaded=100`; `outcome=success` | | AC-2 | Mixed batch, 10 sub-spec tiles | `tiles_rejected_resolution=10`; spy on `write_tile` shows 40 calls | | AC-3 | C6 raises `FreshnessRejectionError` 5x | `tiles_rejected_freshness=5`; run completes | | AC-4 | C6 returns DOWNGRADED for 3 | `tiles_downgraded=3`; tiles present with DOWNGRADED label | | AC-5 | 429 + `Retry-After: 30` | `Clock.sleep` captured ≥ 30s; success path resumes | | AC-6 | 5x 503 | `SatelliteProviderError` with attempt count | | AC-7 | 401 first attempt | `SatelliteProviderError` on first call; zero retries | | AC-8 | Re-run identical request post-success | `outcome=idempotent_no_op`; zero GETs | | AC-9 | `reserve_headroom` returns insufficient | `CacheBudgetExceededError`; zero GETs | | AC-10 | `isinstance` check on impl + on partial fake | True / False | | AC-11 | Log capture during full session | Header values redacted; no raw key in any record | | AC-12 | Kill mid-batch + restart | 30 prior journaled + 70 new = 100 final | | NFR-perf-throughput | 10 GB synthetic batch over loopback | ≥ 50 MB/s | ## Constraints - The `service_api_key` is logged ONLY redacted (`Bearer ***`); the raw value MUST never appear in INFO/WARN/ERROR logs (security finding High in code-review otherwise). - The downloader runs operator-side ONLY; the airborne build target excludes the entire `c11_tilemanager/` source tree (E-BOOT enforces; this task does NOT add the exclusion but its tests assert via R02 that the module's import surface is honest). - The journal location is `cache_root/.c11/journal/__.json` — `request_hash` is `sha256(bbox|zoom_levels|sector_class|service_api_key_hash).hex()[:16]`. - The hot path uses `httpx.Client` (sync), not `httpx.AsyncClient` — the operator workflow is offline minutes; async adds complexity without a measurable win. - Per-tile streaming is mandatory; loading an entire batch into memory is rejected (a 10 GB Derkachi area would OOM on operator workstations). - This task introduces no new runtime dependencies beyond `httpx` and `atomicwrites` (verify both are in the project's `requirements.txt`). - The R02 ADR-004 exclusion is enforced at build time by E-BOOT; this task adds no runtime self-check (security tasks C11-ST-01/02/03 cover that in Step 9). ## Risks & Mitigation **Risk 1: `Retry-After` header format ambiguity (HTTP-date vs. seconds)** - *Risk*: Some servers send `Retry-After: Wed, 21 Oct 2026 07:28:00 GMT`; naïve integer parsing crashes. - *Mitigation*: Parse both forms (`int` first, `email.utils.parsedate_to_datetime` fallback); cap the wait at `config.c11.max_retry_after_s` (default 300s) to prevent server-pinned hangs. **Risk 2: API key leakage via exception messages** - *Risk*: An unhandled exception's repr includes the `httpx.Request` whose headers carry the raw `service_api_key`. - *Mitigation*: Wrap every `httpx` call in try/except → re-raise `SatelliteProviderError` with sanitised message; never include `httpx.Request.headers` in the error payload. **Risk 3: Journal corruption on power-loss** - *Risk*: Process killed mid-`fsync` leaves a torn JSON line; restart fails to parse. - *Mitigation*: `atomicwrites` provides write-then-rename semantics; one journal record per file segment with a checksum; corrupt records are skipped on read with a WARN log and the affected tile is re-fetched. Documented behaviour. **Risk 4: `satellite-provider`'s GET surface returns tiles in a format C11 doesn't expect** - *Risk*: The parent suite changes the response schema (new field, renamed metadata key); C11 silently misparses. - *Mitigation*: Strict pydantic-free DTO parsing — extra fields ignored, required fields → `SatelliteProviderError("response schema mismatch", field=...)`. Versioning for the GET API is owned by `satellite-provider`'s contract; this task pins to the current shape. **Risk 5: Concurrent C11 invocations corrupt the journal** - *Risk*: Operator launches two `download_tiles_for_area` runs against the same `cache_root` simultaneously; both write to the same journal. - *Mitigation*: Per `description.md` § 7, C12 owns a filesystem lockfile that gates concurrent invocations. This task asserts the lockfile exists at construction (`cache_root/.c11/lock`) and refuses to start otherwise. The lockfile creation is C12's job. ## Runtime Completeness - **Named capability**: operator-side download from `satellite-provider` GET surface, RESTRICT-SAT-4 enforcement, AC-NEW-6 freshness counting, AC-8.1/8.3 cache provisioning input. - **Production code that must exist**: real `httpx.Client`-backed `HttpTileDownloader`, real journal write/read with atomicwrites, real resolution gate, real composition-root factory wiring AZ-303/305/307/308 dependencies, real config schema extension. - **Allowed external stubs**: tests MAY use a fake `httpx.Client` (transport-level) to script responses, fake `Clock` for deterministic retries, fake `TileStore`/`TileMetadataStore` (already provided by AZ-303's conformance fakes); production wiring uses real httpx + real Postgres-backed C6. - **Unacceptable substitutes**: a hardcoded "every tile passes" resolution gate (defeats RESTRICT-SAT-4); skipping the journal and re-issuing every GET (defeats I-2 idempotence); silently catching `FreshnessRejectionError` without counting (loses operator visibility for AC-NEW-6); using `requests` instead of `httpx` (project pinned to httpx; switching mid-component is an `Architecture` finding). ## Contract This task produces/implements the contract at `_docs/02_document/contracts/c11_tilemanager/tile_downloader.md`. Consumers MUST read that file — not this task spec — to discover the interface.