gps-denied-onboard/_docs/03_implementation/batch_40_cycle1_report.md
Oleksandr Bezdieniezhnykh 90f4ac78f4 [AZ-316] Implement C11 HttpTileDownloader (batch 40)
Lands the operator-side pre-flight download path: authenticated
httpx GETs against satellite-provider, RESTRICT-SAT-4 (>= 0.5 m/px)
enforcement at the C11 boundary, c6 writes via consumer-side cuts
(_TileWriterLike, _BudgetEnforcerLike), per-(flight_id, request_hash)
journal under cache_root/.c11/journal/ for idempotent re-runs (AC-8,
AC-12), 429 Retry-After + 5xx exponential backoff handling, fail-fast
on TLS / 401 / 403, and a redacted-bearer auth-header policy.

Architecture:
- AZ-507 cross-component rule held: tile_downloader.py imports zero
  c6 symbols; the composition-root _C6DownloadAdapter in
  runtime_root/c11_factory.py absorbs c6's TileMetadata / TileSource /
  FreshnessLabel / VotingStatus enum assembly.
- Sleep-callable injection (not full Clock) per Batch 39 precedent;
  default routes through WallClock.sleep_until_ns to keep the AZ-398
  invariant intact.
- No FDR records on the download path; spec mandates structured logs
  only (8 log kinds wired: session.start/end, resolution_rejected,
  freshness_rejected_summary, freshness_downgraded, batch.retry,
  provider.failed, budget.exceeded, idempotent_no_op).

Tests: 14 new downloader unit tests covering AC-1..AC-9, AC-11, AC-12
plus throughput NFR + 429 HTTP-date + 429 budget exhaustion; 2 new
TileDownloader Protocol conformance tests (AC-10). Full unit suite:
1420 passed, 80 skipped (env-gated), 0 failed.

Code review: PASS_WITH_WARNINGS (5 Low findings, all documentation
or downstream-blocked). See _docs/03_implementation/reviews/
batch_40_review.md and batch_40_cycle1_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 07:01:14 +03:00


Batch 40 — Cycle 1 Report

Date: 2026-05-13
Batch: 40 (single-task batch: C11 download orchestrator)
Tasks:

  • AZ-316 (C11 TileDownloader, 5pt)

Total complexity: 5pt
Status: complete; pending transition to "In Testing".

Scope

Batch 40 lands the production HttpTileDownloader — the operator-side pre-flight path that completes the C11 contract surface (gate + signing key + uploader were Batches 38/39). It composes consumer-side cuts over c6's TileStore / TileMetadataStore / CacheBudgetEnforcer into a single class that:

  1. Computes a deterministic request_hash over (flight_id, bbox, zoom_levels, sector_class, sha256(api_key)) and uses it as the per-batch journal filename suffix
  2. Reads the per-(flight_id, request_hash) journal at cache_root/.c11/journal/<flight_id>__<hash>.json. If a complete prior run exists, returns outcome = idempotent_no_op immediately (zero GETs, zero writes — AC-8)
  3. Otherwise resumes from the journal (skipping previously-completed tile ids — AC-12)
  4. Issues GET /api/satellite/tiles?…&list-only=true against satellite-provider to enumerate the bbox × zoom-level grid and build a list[TileSummary] with per-tile produced_at / resolution_m_per_px / estimated_bytes
  5. Pre-checks cache headroom via the consumer-side cut over c6's CacheBudgetEnforcer.reserve_headroom. On insufficient budget, wraps c6's CacheBudgetExhaustedError into the C11-local CacheBudgetExceededError and aborts before any GET fires (AC-9)
  6. Per tile:
    • Resolution gate at C11 boundary — if resolution_m_per_px < 0.5, increments tiles_rejected_resolution, emits a per-tile WARN log, and skips the tile WITHOUT a GET (AC-2)
    • Authenticated GET against the per-tile endpoint with TLS + Authorization: Bearer <api_key> (header redacted in every log path — AC-11)
    • 429 honours Retry-After (RFC 7231 integer-seconds AND HTTP-date forms), with a configurable cumulative wait budget; budget exhaustion → RateLimitedError (AC-5 + spec Risk 1)
    • 5xx exponential backoff (1s/2s/4s/8s, 4 retries by config) → persistent failure raises SatelliteProviderError (AC-6)
    • 401 / 403 → fail-fast SatelliteProviderError on the FIRST attempt (AC-7)
    • Hands the JPEG bytes + per-tile metadata primitives to the _TileWriterLike cut, which the composition-root adapter translates into a c6 TileMetadata envelope before calling tile_store.write_tile + tile_metadata_store.insert_metadata
    • Catches the c6 FreshnessRejectionError by structural class name match and increments tiles_rejected_freshness without propagating (AC-3); a single per-batch summary WARN log surfaces the count
    • Maps the post-insert label to the tiles_downgraded counter when the adapter reports "downgraded" (AC-4)
  7. After every successful tile write, atomically rewrites the journal (write-then-rename + fsync + directory fsync), so a process kill at any point leaves a recoverable state (AC-12)
  8. On batch completion, stamps the journal's completed_at_iso field and returns a DownloadBatchReport with the full per-tile counts envelope plus the request_hash for caller correlation
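The 429 handling in step 6 can be sketched as follows. This is a minimal illustration, not the shipped code: `parse_retry_after`, `wait_on_429`, the `fallback_s` default, and the stand-in `RateLimitedError` class are hypothetical names chosen for the example. It assumes the cumulative wait budget is a plain number of seconds and that the sleep primitive is injected.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Optional


class RateLimitedError(Exception):
    """Stand-in for the C11 error named in the report."""


def parse_retry_after(value: str, now: datetime) -> Optional[float]:
    """Return the requested wait in seconds, or None if unparseable."""
    value = value.strip()
    if value.isascii() and value.isdigit():  # integer-seconds form
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:  # treat a zone-less date as UTC
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())


def wait_on_429(
    header: str,
    waited_s: float,
    budget_s: float,
    sleep: Callable[[float], None],
    now: datetime,
    fallback_s: float = 1.0,  # assumption: used when the header is unparseable
) -> float:
    """Sleep as the provider asks and return the new cumulative wait."""
    delay = parse_retry_after(header, now)
    if delay is None:
        delay = fallback_s
    if waited_s + delay > budget_s:
        raise RateLimitedError(
            f"Retry-After would exceed the {budget_s:.0f}s wait budget"
        )
    sleep(delay)
    return waited_s + delay
```

Passing `now` explicitly keeps the HTTP-date branch deterministic under test, which is presumably how a "429 HTTP-date" unit test would pin the expected delay.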

Architectural decisions

AZ-507 — consumer-side cuts for c6

The task spec lists tile_store: TileStore, tile_metadata_store: TileMetadataStore, and budget_enforcer: CacheBudgetEnforcer as constructor parameters. A direct from gps_denied_onboard.components.c6_tile_cache import … would violate AZ-507 and trip the AZ-270 lint. Instead, tile_downloader.py declares two local Protocol cuts that duck-type the c6 surfaces it actually uses:

  • _TileWriterLike: Protocol cut implemented by the composition-root adapter, which hides c6's TileMetadata / TileSource / FreshnessLabel / VotingStatus enum assembly. It takes primitives (zoom/lat/lon, tile_size, capture_ts, content sha256, sector_class) plus the JPEG bytes and returns a string label ("fresh" / "downgraded").
  • _BudgetEnforcerLike: single-method cut over CacheBudgetEnforcer.reserve_headroom. Exception mapping happens inside the adapter, so the downloader never catches a c6 budget exception type.

The composition root (build_tile_downloader + the private _C6DownloadAdapter class) is the single layer that may bind concrete c6 implementations and import c6 enums. _C6DownloadAdapter implements both Protocol cuts so the downloader sees a single backing object.
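A rough sketch of the consumer-side-cut pattern, under stated assumptions: only `reserve_headroom` and the "fresh" / "downgraded" labels come from the report. The `write_tile` signature, `InMemoryAdapter`, and `SketchDownloader` are hypothetical stand-ins for illustration, and the real adapter binds c6 stores rather than a dict.

```python
from typing import Dict, Protocol


class CacheBudgetExceededError(Exception):
    """Stand-in for the C11-local error named in the report."""


class TileWriterLike(Protocol):
    def write_tile(self, *, zoom: int, lat: float, lon: float, tile_size: int,
                   capture_ts: str, content_sha256: str, sector_class: str,
                   jpeg: bytes) -> str:
        """Persist one tile and return "fresh" or "downgraded"."""
        ...


class BudgetEnforcerLike(Protocol):
    def reserve_headroom(self, required_bytes: int) -> None:
        """Raise CacheBudgetExceededError when the cache cannot fit the batch."""
        ...


class InMemoryAdapter:
    """One backing object that satisfies both cuts, like the adapter role."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.tiles: Dict[str, bytes] = {}

    def reserve_headroom(self, required_bytes: int) -> None:
        # Exception mapping lives here, not in the downloader.
        if required_bytes > self.capacity_bytes:
            raise CacheBudgetExceededError(f"need {required_bytes} bytes")

    def write_tile(self, *, zoom: int, lat: float, lon: float, tile_size: int,
                   capture_ts: str, content_sha256: str, sector_class: str,
                   jpeg: bytes) -> str:
        self.tiles[content_sha256] = jpeg
        return "fresh"


class SketchDownloader:
    """Depends only on the two Protocols, never on a concrete store."""

    def __init__(self, writer: TileWriterLike, budget: BudgetEnforcerLike) -> None:
        self._writer = writer
        self._budget = budget

    def store(self, jpeg: bytes, estimated_bytes: int) -> str:
        self._budget.reserve_headroom(estimated_bytes)
        return self._writer.write_tile(
            zoom=18, lat=50.45, lon=30.52, tile_size=256,
            capture_ts="2026-05-13T00:00:00Z", content_sha256="abc123",
            sector_class="default", jpeg=jpeg,
        )
```

Because the Protocols are structural, the same adapter instance can be passed for both constructor arguments, and a unit test can substitute any object with matching methods.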

The c6 freshness-rejection exception is recognised by class-name match (exc.__class__.__name__ == "FreshnessRejectionError" plus an MRO walk) — see _is_freshness_rejection — so the adapter is free to re-raise the c6 type directly without forcing the downloader to import the c6 errors module.
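A class-name match with an MRO walk might look like the sketch below. The function name here is illustrative and the body is a guess at the technique, not a copy of `_is_freshness_rejection`.

```python
def is_freshness_rejection(exc: BaseException) -> bool:
    """True if exc, or any of its base classes, is named FreshnessRejectionError."""
    return any(
        cls.__name__ == "FreshnessRejectionError"
        for cls in type(exc).__mro__
    )


# Local stand-ins so the example runs without importing c6.
class FreshnessRejectionError(Exception):
    pass


class StaleTileError(FreshnessRejectionError):
    pass
```

Walking the MRO means subclasses of the c6 error are recognised too. The trade-off is that any unrelated class that happens to share the name would also match, which is usually acceptable for a single well-known error type.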

Sleep injection vs. full Clock injection

Same rationale as Batch 39's F2 (recurring deviation): the downloader only ever needs a sleep primitive (for 429 / 5xx backoff), never monotonic_ns or time_ns. Implementation accepts a sleep: Callable[[float], None] defaulting to a WallClock-routed helper, preserving the AZ-398 invariant that components/ never calls time.sleep directly. Documented in the batch review as F5 (Low).
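To illustrate why a sleep callable is enough, here is a minimal sketch of 5xx backoff with an injected sleep. The helper name `get_with_backoff`, the status-code-returning `attempt` callable, and the stand-in error class are assumptions for the example; the 1s/2s/4s/8s schedule and the default of 4 retries follow the report.

```python
from typing import Callable


class SatelliteProviderError(Exception):
    """Stand-in for the C11 error named in the report."""


def get_with_backoff(
    attempt: Callable[[], int],
    sleep: Callable[[float], None],
    max_retries: int = 4,
    base_delay_s: float = 1.0,
) -> int:
    """Call attempt() until it returns a non-5xx status or retries run out."""
    for retry in range(max_retries + 1):  # one initial try plus max_retries
        status = attempt()
        if status < 500:
            return status
        if retry < max_retries:
            sleep(base_delay_s * 2 ** retry)  # 1s, 2s, 4s, 8s by default
    raise SatelliteProviderError(f"5xx persisted after {max_retries} retries")
```

In a test, passing `delays.append` as `sleep` records the schedule without waiting, so no clock abstraction beyond the callable is needed.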

Failure paths raise vs. return FAILURE

Same rationale as Batch 39's F1 (recurring deviation): the spec prose describes outcome = failure as a return value for budget / auth / persistent-5xx scenarios; the implementation raises typed exceptions (CacheBudgetExceededError, SatelliteProviderError, RateLimitedError). The exception path still flushes the journal with tile_counts reflecting the partial run so the next operator invocation resumes. Documented in the batch review as F1 (Low).

Journal format + atomicity

Per spec Risk 3, the journal is written via the same write-then-rename + fsync pattern the project already uses for the C9 download journal. It is implemented inline rather than with the atomicwrites library: the requirements file does not pin atomicwrites (confirmed in the Batch 39 follow-up), and reusing the existing pattern avoids introducing a new dependency. Torn or corrupted journals are treated as "no prior journal", so the batch re-runs from scratch (Risk 3 mitigation).
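The write-then-rename + fsync pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the project's `_atomic_write_json`: the function names and the journal payload shape are hypothetical, and the directory fsync as written is POSIX-only.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write payload so readers see either the old file or the new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory so the rename is atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())  # file contents reach disk before the rename
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    # Persist the rename itself by syncing the parent directory (POSIX).
    dir_fd = os.open(path.parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def load_journal(path: Path) -> Optional[dict]:
    """Return the journal dict, or None for a missing or torn journal."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, ValueError):  # ValueError covers bad JSON
        return None
    return data if isinstance(data, dict) else None
```

Returning `None` for a corrupted file mirrors the "no prior journal" treatment: the caller cannot distinguish a torn journal from an absent one and simply starts the batch from scratch.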

Logging only — no FDR records

The spec calls for INFO/WARN/ERROR structured logs (Outcome §, "INFO log: kind=…session.start/.end"). Re-reading the spec end-to-end confirms NO FDR record kinds are mandated for the download path. Operator-side runs do not need the same audit-trail durability the upload path requires (no per-flight signing, no parent-suite acknowledgement to correlate). The eight log kinds wired into the implementation cover every transition the operator-tooling CLI needs to render a post-run summary.
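As a hedged illustration of structured logging combined with the redacted-bearer policy (AC-11), the sketch below emits one JSON log line per event and masks the Authorization header before it can reach a log record. The helper names, the `<redacted>` placeholder, and the field layout are assumptions; only the log kind `batch.retry` is taken from the list of wired kinds.

```python
import json
import logging
from typing import Any, Dict, Mapping


def redact_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Copy headers, masking the Authorization value regardless of case."""
    return {
        name: "<redacted>" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }


def log_event(logger: logging.Logger, level: int, kind: str, **fields: Any) -> str:
    """Emit one structured log line and return it for inspection."""
    line = json.dumps({"kind": kind, **fields}, sort_keys=True)
    logger.log(level, line)
    return line


logger = logging.getLogger("c11.sketch")
line = log_event(
    logger,
    logging.WARNING,
    "batch.retry",
    status=503,
    headers=redact_headers(
        {"Authorization": "Bearer s3cret", "Accept": "image/jpeg"}
    ),
)
```

Redacting at the point where headers are copied into log fields, rather than in a formatter, keeps the secret out of every handler by construction.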

Files touched

Production:

  • src/gps_denied_onboard/components/c11_tile_manager/_types.py (added SectorClassification, DownloadOutcome, TileSummary, DownloadRequest, DownloadBatchReport)
  • src/gps_denied_onboard/components/c11_tile_manager/errors.py (added ResolutionRejectionError, CacheBudgetExceededError)
  • src/gps_denied_onboard/components/c11_tile_manager/config.py (added 6 download-side fields: satellite_provider_url, service_api_key, download_http_timeout_s, download_max_5xx_retries, download_max_retry_after_s, download_resolution_floor_m_per_px)
  • src/gps_denied_onboard/components/c11_tile_manager/interface.py (TileDownloader Protocol now has the real signature)
  • src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py (new — HttpTileDownloader, request_hash, _JournalState, _atomic_write_json, _TileWriterLike, _BudgetEnforcerLike, _is_freshness_rejection)
  • src/gps_denied_onboard/components/c11_tile_manager/__init__.py (re-exports for download-side public API)
  • src/gps_denied_onboard/runtime_root/c11_factory.py (added build_tile_downloader + private _C6DownloadAdapter)

Tests:

  • tests/unit/c11_tile_manager/test_tile_downloader.py (new — 14 tests)
  • tests/unit/c11_tile_manager/test_protocol_conformance.py (added 2 tests for TileDownloader AC-10)

Test results

pytest tests/unit -q:

  • 1420 passed, 80 skipped, 0 failed
  • +16 tests vs. Batch 39's 1404 baseline (matches the 14 new downloader tests + 2 new conformance tests)
  • Skips are environment-gated (Docker compose, CUDA, TensorRT, Tier-2 hardware, actionlint); none are AZ-316-related

pytest tests/unit/c11_tile_manager/:

  • 57 passed (Batch 38 + Batch 39 + Batch 40 combined)
  • Downloader: AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9, AC-11, AC-12, plus the throughput NFR, plus 429 HTTP-date form parsing, plus 429 budget exhaustion → RateLimitedError
  • Conformance: AC-10 positive (isinstance(impl, TileDownloader)) and negative (partial fake rejected)
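The positive and negative conformance checks rely on runtime-checkable Protocols. The sketch below shows the mechanism with a hypothetical two-method Protocol; `DownloaderLike` and its method names are invented for the example and do not reflect the real TileDownloader signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DownloaderLike(Protocol):
    def download(self, request: object) -> object:
        ...

    def close(self) -> None:
        ...


class FullFake:
    def download(self, request: object) -> object:
        return None

    def close(self) -> None:
        return None


class PartialFake:
    # Missing close(), so it does not satisfy the Protocol.
    def download(self, request: object) -> object:
        return None
```

With these definitions, `isinstance(FullFake(), DownloaderLike)` holds and `isinstance(PartialFake(), DownloaderLike)` does not. Note that the runtime check only verifies that the methods exist; argument and return types are left to the static type checker.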

ReadLints: clean across all touched files.

Code review verdict

PASS_WITH_WARNINGS — see _docs/03_implementation/reviews/batch_40_review.md. Five Low findings, all documentation-level or downstream-blocked (recurring spec-prose vs. typed-exception drift, adapter freshness-label conservatism pending an AZ-303 ABI extension, deferred Risk-5 lockfile assertion blocked on E-C12, missing cache_root writability pre-validation, recurring Clock-vs-sleep injection deviation). No code change required for batch close-out.

Cumulative review

Batch 40 is single-task and closes the C11 contract surface (downloader + uploader + gate + signing key all wired and tested). The next cumulative review window covers batches 40-42; that report will land before Batch 43 starts. Two recurring Low findings (F1 — failure paths raise vs. return; F5 — sleep vs. Clock injection) are now visible in three consecutive batch reviews and should be captured as a single hygiene PBI in the next cumulative review.