[AZ-316] Implement C11 HttpTileDownloader (batch 40)

Lands the operator-side pre-flight download path: authenticated
httpx GETs against satellite-provider, RESTRICT-SAT-4 (>= 0.5 m/px)
enforcement at the C11 boundary, c6 writes via consumer-side cuts
(_TileWriterLike, _BudgetEnforcerLike), per-(flight_id, request_hash)
journal under cache_root/.c11/journal/ for idempotent re-runs (AC-8,
AC-12), 429 Retry-After + 5xx exponential backoff handling, fail-fast
on TLS / 401 / 403, and a redacted-bearer auth-header policy.

Architecture:
- AZ-507 cross-component rule held: tile_downloader.py imports zero
  c6 symbols; the composition-root _C6DownloadAdapter in
  runtime_root/c11_factory.py absorbs c6's TileMetadata / TileSource /
  FreshnessLabel / VotingStatus enum assembly.
- Sleep-callable injection (not full Clock) per Batch 39 precedent;
  default routes through WallClock.sleep_until_ns to keep the AZ-398
  invariant intact.
- No FDR records on the download path; spec mandates structured logs
  only (8 log kinds wired: session.start/end, resolution_rejected,
  freshness_rejected_summary, freshness_downgraded, batch.retry,
  provider.failed, budget.exceeded, idempotent_no_op).

Tests: 14 new downloader unit tests covering AC-1..AC-9, AC-11, AC-12
plus throughput NFR + 429 HTTP-date + 429 budget exhaustion; 2 new
TileDownloader Protocol conformance tests (AC-10). Full unit suite:
1420 passed, 80 skipped (env-gated), 0 failed.

Code review: PASS_WITH_WARNINGS (5 Low findings, all documentation
or downstream-blocked). See _docs/03_implementation/reviews/
batch_40_review.md and batch_40_cycle1_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-13 07:01:14 +03:00
parent 3a61a4f5bf
commit 90f4ac78f4
13 changed files with 2513 additions and 62 deletions
@@ -0,0 +1,216 @@
# Batch 40 — Cycle 1 Report
**Date**: 2026-05-13
**Batch**: 40 (single-task batch — C11 download orchestrator)
**Tasks**:
- AZ-316 (C11 TileDownloader, 5pt)
**Total complexity**: 5pt
**Status**: complete; pending transition to "In Testing".
## Scope
Batch 40 lands the production `HttpTileDownloader` — the operator-side
pre-flight path that completes the C11 contract surface (gate + signing
key + uploader were Batches 38/39). It composes consumer-side cuts
over c6's `TileStore` / `TileMetadataStore` / `CacheBudgetEnforcer`
into a single class that:
1. Computes a deterministic `request_hash` over `(flight_id, bbox,
zoom_levels, sector_class, sha256(api_key))` and uses it as the
per-batch journal filename suffix
2. Reads the per-`(flight_id, request_hash)` journal at
`cache_root/.c11/journal/<flight_id>__<hash>.json`. If a complete
prior run exists, returns `outcome = idempotent_no_op` immediately
(zero GETs, zero writes — AC-8)
3. Otherwise resumes from the journal (skipping previously-completed
tile ids — AC-12)
4. Issues `GET /api/satellite/tiles?…&list-only=true` against
`satellite-provider` to enumerate the bbox × zoom-level grid and
build a `list[TileSummary]` with per-tile `produced_at` /
`resolution_m_per_px` / `estimated_bytes`
5. Pre-checks cache headroom via the consumer-side cut over c6's
`CacheBudgetEnforcer.reserve_headroom`. On insufficient budget,
wraps c6's `CacheBudgetExhaustedError` into the C11-local
`CacheBudgetExceededError` and aborts before any GET fires (AC-9)
6. Per tile:
   - Resolution gate at C11 boundary: if `resolution_m_per_px < 0.5`,
     increments `tiles_rejected_resolution`, emits a per-tile
     WARN log, and skips the tile WITHOUT a GET (AC-2)
   - Authenticated GET against the per-tile endpoint with TLS +
     `Authorization: Bearer <api_key>` (header redacted in every
     log path; AC-11)
   - 429 honours `Retry-After` (RFC 7231 integer-seconds AND
     HTTP-date forms), with a configurable cumulative wait budget;
     budget exhaustion → `RateLimitedError` (AC-5 + spec Risk 1)
   - 5xx exponential backoff (1s/2s/4s/8s, 4 retries by config)
     → persistent failure raises `SatelliteProviderError` (AC-6)
   - 401 / 403 → fail-fast `SatelliteProviderError` on the FIRST
     attempt (AC-7)
   - Hands the JPEG bytes + per-tile metadata primitives to the
     `_TileWriterLike` cut, which the composition-root adapter
     translates into a c6 `TileMetadata` envelope before calling
     `tile_store.write_tile` + `tile_metadata_store.insert_metadata`
   - Catches the c6 `FreshnessRejectionError` by structural class
     name match and increments `tiles_rejected_freshness` without
     propagating (AC-3); a single per-batch summary WARN log
     surfaces the count
   - Maps the post-insert label to the `tiles_downgraded` counter
     when the adapter reports `"downgraded"` (AC-4)
7. After every successful tile write, atomically rewrites the
journal (write-then-rename + `fsync` + directory `fsync`),
so a process kill at any point leaves a recoverable state
(AC-12)
8. On batch completion, stamps the journal's `completed_at_iso`
field and returns a `DownloadBatchReport` with the full per-tile
counts envelope plus the `request_hash` for caller correlation
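
Steps 1 and 2 above can be illustrated with a minimal sketch. It assumes a canonical-JSON encoding of the request tuple and an order-insensitive treatment of `zoom_levels`; the report states neither, and the `journal_path` helper is hypothetical. Only the hashed fields and the `<flight_id>__<hash>.json` filename shape come from the report.

```python
import hashlib
import json
from pathlib import Path


def request_hash(flight_id, bbox, zoom_levels, sector_class, api_key):
    """Deterministic digest over the request tuple (encoding is an assumption)."""
    payload = {
        "flight_id": flight_id,
        "bbox": [float(v) for v in bbox],
        # Assumption: zoom order does not matter, so sort before hashing.
        "zoom_levels": sorted(int(z) for z in zoom_levels),
        "sector_class": sector_class,
        # Only a digest of the key enters the hash, never the key itself.
        "api_key_sha256": hashlib.sha256(api_key.encode("utf-8")).hexdigest(),
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def journal_path(cache_root, flight_id, digest):
    """Hypothetical helper: cache_root/.c11/journal/<flight_id>__<hash>.json."""
    return Path(cache_root) / ".c11" / "journal" / f"{flight_id}__{digest}.json"
```

Because the digest depends only on the request tuple, re-running the same request resolves to the same journal file, which is what makes the idempotent no-op check in step 2 possible.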
## Architectural decisions
### AZ-507 — consumer-side cuts for c6
The task spec lists `tile_store: TileStore`,
`tile_metadata_store: TileMetadataStore`, and
`budget_enforcer: CacheBudgetEnforcer` as constructor parameters.
A direct `from gps_denied_onboard.components.c6_tile_cache import …`
would violate AZ-507 and trip the AZ-270 lint. Instead,
`tile_downloader.py` declares two local `Protocol` cuts that
duck-type the c6 surfaces it actually uses:
- `_TileWriterLike` — composition-root adapter that hides c6's
`TileMetadata` / `TileSource` / `FreshnessLabel` / `VotingStatus`
enum assembly; takes primitives (zoom/lat/lon, tile_size, capture_ts,
content sha256, sector_class) plus the JPEG bytes and returns a
string label (`"fresh"` / `"downgraded"`).
- `_BudgetEnforcerLike` — single-method cut over
`CacheBudgetEnforcer.reserve_headroom`; exception mapping happens
inside the adapter so the downloader never catches a c6 type.
The composition root (`build_tile_downloader` + the private
`_C6DownloadAdapter` class) is the single layer that may bind
concrete c6 implementations and import c6 enums. `_C6DownloadAdapter`
implements both `Protocol` cuts so the downloader sees a single
backing object.
The c6 freshness-rejection exception is recognised by class-name
match (`exc.__class__.__name__ == "FreshnessRejectionError"` plus an
MRO walk) — see `_is_freshness_rejection` — so the adapter is free to
re-raise the c6 type directly without forcing the downloader to
import the c6 errors module.
### Sleep injection vs. full Clock injection
Same rationale as Batch 39's F2 (recurring deviation): the
downloader only ever needs a sleep primitive (for 429 / 5xx backoff),
never `monotonic_ns` or `time_ns`. Implementation accepts a
`sleep: Callable[[float], None]` defaulting to a `WallClock`-routed
helper, preserving the AZ-398 invariant that `components/` never
calls `time.sleep` directly. Documented in the batch review as F5
(Low).
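
The value of injecting only a sleep callable can be sketched with the 5xx backoff loop. This is an illustrative assumption, not the downloader's code: `get_with_backoff`, `TransientServerError`, and the parameter names are hypothetical, and the `WallClock`-routed default is left out so that the sketch never touches `time.sleep`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientServerError(Exception):
    """Hypothetical marker for a 5xx response."""


class SatelliteProviderError(Exception):
    """Stand-in for the C11 error named in the report."""


def get_with_backoff(
    fetch: Callable[[], T],
    *,
    sleep: Callable[[float], None],
    max_retries: int = 4,
    base_delay_s: float = 1.0,
) -> T:
    """Call fetch(); on a transient error wait 1s, 2s, 4s, 8s between tries."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except TransientServerError as exc:
            if attempt == max_retries:
                # Retries exhausted: surface a typed, persistent failure.
                raise SatelliteProviderError("5xx persisted") from exc
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

A test can pass `sleep=delays.append` and assert on the recorded schedule without any real waiting, which is the main practical benefit of the narrower injection point.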
### Failure paths raise vs. return FAILURE
Same rationale as Batch 39's F1 (recurring deviation): the spec
prose describes `outcome = failure` as a return value for budget /
auth / persistent-5xx scenarios; the implementation raises typed
exceptions (`CacheBudgetExceededError`, `SatelliteProviderError`,
`RateLimitedError`). The exception path still flushes the journal
with `tile_counts` reflecting the partial run so the next operator
invocation resumes. Documented in the batch review as F1 (Low).
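
The raise-after-flush behaviour could be shaped roughly as follows. `run_batch`, `fetch_and_store`, `flush_journal`, and the `tiles_written` counter are hypothetical names chosen for this sketch; only the idea of flushing partial counts before the typed exception propagates comes from the report.

```python
from typing import Callable, Iterable


class SatelliteProviderError(Exception):
    """Stand-in for the C11 error named in the report."""


def run_batch(
    tile_ids: Iterable[str],
    fetch_and_store: Callable[[str], None],
    flush_journal: Callable[[list, dict, bool], None],
) -> dict:
    """Process tiles in order; always leave a journal behind, even on failure."""
    completed: list = []
    counts = {"tiles_written": 0}
    try:
        for tile_id in tile_ids:
            fetch_and_store(tile_id)  # may raise a typed error
            completed.append(tile_id)
            counts["tiles_written"] += 1
    except Exception:
        # Persist the partial run, then let the typed error propagate.
        flush_journal(list(completed), dict(counts), False)
        raise
    flush_journal(list(completed), dict(counts), True)
    return counts
```

The caller sees a typed exception rather than a failure outcome, while the flushed journal still records which tiles finished, so a later invocation can skip them.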
### Journal format + atomicity
Per spec Risk 3, the journal is written via the same
write-then-rename + `fsync` pattern the project already uses for the
C9 download journal. It is implemented inline rather than via the
`atomicwrites` library: the requirements file does not pin
`atomicwrites` (confirmed in the Batch 39 follow-up), and reusing the
existing pattern avoids introducing a new dependency.
Torn / corrupted journals are treated
as "no prior journal" so the batch re-runs from scratch (Risk 3
mitigation).
### Logging only — no FDR records
The spec calls for INFO/WARN/ERROR structured logs (Outcome §,
"INFO log: `kind=…session.start/.end`"). Re-reading the spec end-to-end
confirms NO FDR record kinds are mandated for the download path.
Operator-side runs do not need the same audit-trail durability the
upload path requires (no per-flight signing, no parent-suite
acknowledgement to correlate). The eight log kinds wired into the
implementation cover every transition the operator-tooling CLI
needs to render a post-run summary.
## Files touched
Production:
- `src/gps_denied_onboard/components/c11_tile_manager/_types.py`
(added `SectorClassification`, `DownloadOutcome`, `TileSummary`,
`DownloadRequest`, `DownloadBatchReport`)
- `src/gps_denied_onboard/components/c11_tile_manager/errors.py`
(added `ResolutionRejectionError`, `CacheBudgetExceededError`)
- `src/gps_denied_onboard/components/c11_tile_manager/config.py`
(added 6 download-side fields: `satellite_provider_url`,
`service_api_key`, `download_http_timeout_s`,
`download_max_5xx_retries`, `download_max_retry_after_s`,
`download_resolution_floor_m_per_px`)
- `src/gps_denied_onboard/components/c11_tile_manager/interface.py`
(`TileDownloader` Protocol now has the real signature)
- `src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py`
(new — `HttpTileDownloader`, `request_hash`, `_JournalState`,
`_atomic_write_json`, `_TileWriterLike`, `_BudgetEnforcerLike`,
`_is_freshness_rejection`)
- `src/gps_denied_onboard/components/c11_tile_manager/__init__.py`
(re-exports for download-side public API)
- `src/gps_denied_onboard/runtime_root/c11_factory.py`
(added `build_tile_downloader` + private `_C6DownloadAdapter`)
Tests:
- `tests/unit/c11_tile_manager/test_tile_downloader.py` (new — 14 tests)
- `tests/unit/c11_tile_manager/test_protocol_conformance.py`
(added 2 tests for `TileDownloader` AC-10)
## Test results
`pytest tests/unit -q`:
- **1420 passed**, 80 skipped, 0 failed
- +16 tests vs. Batch 39's 1404 baseline (matches the 14 new downloader
tests + 2 new conformance tests)
- Skips are environment-gated (Docker compose, CUDA, TensorRT,
Tier-2 hardware, `actionlint`); none are AZ-316-related
`pytest tests/unit/c11_tile_manager/`:
- 57 passed (Batch 38 + Batch 39 + Batch 40 combined)
- Downloader: AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9,
AC-11, AC-12, plus the throughput NFR, plus 429 HTTP-date form
parsing, plus 429 budget exhaustion → `RateLimitedError`
- Conformance: AC-10 positive (`isinstance(impl, TileDownloader)`)
+ negative (partial fake rejected)
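
For orientation, the two `Retry-After` forms and the cumulative wait budget exercised by the 429 tests could be handled along these lines. `parse_retry_after` and `RetryAfterBudget` are hypothetical names, and the explicit `now` argument is an assumption made to keep the sketch deterministic; only `RateLimitedError` is named in the report.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


class RateLimitedError(Exception):
    """Stand-in for the C11 error named in the report."""


def parse_retry_after(value: str, now: datetime) -> float:
    """Return the delay in seconds for either Retry-After form."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)  # integer-seconds form
    when = parsedate_to_datetime(value)  # HTTP-date form
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())


class RetryAfterBudget:
    """Tracks cumulative 429 wait time against a configured ceiling."""

    def __init__(self, max_total_s: float) -> None:
        self.max_total_s = max_total_s
        self.spent_s = 0.0

    def charge(self, delay_s: float) -> float:
        if self.spent_s + delay_s > self.max_total_s:
            raise RateLimitedError("cumulative Retry-After budget exhausted")
        self.spent_s += delay_s
        return delay_s
```

A caller would pass the result of `parse_retry_after` to `charge` and sleep for the returned value, so the budget check happens before any waiting starts.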
`ReadLints`: clean across all touched files.
## Code review verdict
**PASS_WITH_WARNINGS** — see
`_docs/03_implementation/reviews/batch_40_review.md`. Five Low
findings, all documentation-level or downstream-blocked (recurring
spec-prose vs. typed-exception drift, adapter freshness-label
conservatism pending an AZ-303 ABI extension, deferred Risk-5
lockfile assertion blocked on E-C12, missing `cache_root`
writability pre-validation, recurring Clock-vs-sleep injection
deviation). No code change required for batch close-out.
## Cumulative review
Batch 40 is single-task and closes the C11 contract surface
(downloader + uploader + gate + signing key all wired and tested).
The next cumulative review window covers batches 40-42; that
report will land before Batch 43 starts. Two recurring Low
findings (F1 — failure paths raise vs. return; F5 — sleep vs.
Clock injection) are now visible in three consecutive batch
reviews and should be captured as a single hygiene PBI in the
next cumulative review.