[AZ-320] Add C11 IdempotentRetryTileUploader decorator

Wraps HttpTileUploader (AZ-319) with two bounded retry budgets:

- In-call (per-batch) — re-invokes inner on PARTIAL outcome up to
  `max_in_call_retries` times with capped exponential backoff
  (`min(base ** attempt_number, cap)`). On exhaustion: surfaces an
  operator hint via `next_retry_at_s = now + backoff_cap_s`.
- Per-tile (cross-call) — atomically increments c6's
  `tiles.upload_attempts` counter for every rejection; once a tile
  hits `max_per_tile_attempts` it is forward-only transitioned to
  `voting_status = upload_giveup` (excluded from `pending_uploads`).
  Each transition emits FDR `kind="c11.upload.giveup"` plus an
  ERROR log.
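
A minimal sketch of the in-call backoff schedule described above, assuming
`attempt_number` starts at 1 for the first retry and that base and cap are
in seconds. The function names are hypothetical and not taken from the
decorator. Only the formula and the `now + backoff_cap_s` hint come from
this message.

```python
def backoff_delay_s(attempt_number: int, base_s: float, cap_s: float) -> float:
    """Capped exponential backoff: min(base ** attempt_number, cap)."""
    return min(base_s ** attempt_number, cap_s)


def backoff_schedule(
    max_in_call_retries: int, base_s: float, cap_s: float
) -> list[float]:
    """Delays before each in-call retry (assumes attempts are numbered from 1)."""
    return [
        backoff_delay_s(n, base_s, cap_s)
        for n in range(1, max_in_call_retries + 1)
    ]


def next_retry_hint_s(now_s: float, backoff_cap_s: float) -> float:
    """Operator hint surfaced once the in-call budget is exhausted."""
    return now_s + backoff_cap_s
```

With `base_s=2.0` and `cap_s=10.0`, five retries wait 2, 4, 8, 10, 10
seconds: the cap takes over from the fourth retry onward.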

C6 contract changes (AZ-303 v1.3.0):
- VotingStatus.UPLOAD_GIVEUP added (forward-only from PENDING/TRUSTED).
- TileMetadataStore.increment_upload_attempts(tile_id) -> int added
  with NotImplementedError default for backwards-compat.
- Migration 0003_c11_upload_attempts: additive column +
  widened ck_tiles_voting_status (preserves IS NULL clause).
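
A rough illustration of the two C6 additions, not the actual C6 code. The
enum values other than `upload_giveup`, the `str` tile id type, and the
classes `TileMetadataStoreBase` and `InMemoryTileMetadataStore` are
assumptions made for the example.

```python
import threading
from enum import Enum


class VotingStatus(str, Enum):
    PENDING = "pending"              # value assumed for illustration
    TRUSTED = "trusted"              # value assumed for illustration
    UPLOAD_GIVEUP = "upload_giveup"  # value named in this commit


# Forward-only: UPLOAD_GIVEUP is reachable from PENDING/TRUSTED and never left.
_GIVEUP_SOURCES = frozenset({VotingStatus.PENDING, VotingStatus.TRUSTED})


def can_give_up(current: VotingStatus) -> bool:
    """True if a tile in `current` status may move to UPLOAD_GIVEUP."""
    return current in _GIVEUP_SOURCES


class TileMetadataStoreBase:
    """Hypothetical base: older stores inherit the raising default."""

    def increment_upload_attempts(self, tile_id: str) -> int:
        raise NotImplementedError("store does not track upload attempts")


class InMemoryTileMetadataStore(TileMetadataStoreBase):
    """Hypothetical store that opts in to the new counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._attempts: dict[str, int] = {}

    def increment_upload_attempts(self, tile_id: str) -> int:
        # The lock makes read-modify-write atomic across threads.
        with self._lock:
            self._attempts[tile_id] = self._attempts.get(tile_id, 0) + 1
            return self._attempts[tile_id]
```

A raising default keeps existing store implementations importable and
usable for everything except the new retry bookkeeping.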

C11 wiring:
- C11RetryConfig + disable_retry_decorator on C11Config.
- build_tile_uploader wraps in decorator by default; bypass flag
  returns the bare HttpTileUploader. New `clock` keyword.

Cross-component isolation honoured (AZ-507): the decorator declares a
`_RetryMetadataStoreLike` Protocol cut over c6's TileMetadataStore
and references `UPLOAD_GIVEUP` via a local string constant, so it
needs no c6 imports.
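
The consumer-side cut could look roughly like the sketch below. It assumes
the give-up transition is a separate store method; `set_voting_status`,
`record_rejection` and `FakeStore` are hypothetical names, and only
`increment_upload_attempts`, `max_per_tile_attempts` and the
`upload_giveup` value come from this message.

```python
from typing import Protocol

# Local string constant instead of importing c6's VotingStatus enum.
UPLOAD_GIVEUP = "upload_giveup"


class RetryMetadataStoreLike(Protocol):
    """Only the store surface the retry decorator is assumed to need."""

    def increment_upload_attempts(self, tile_id: str) -> int: ...

    # Hypothetical method name for the forward-only transition.
    def set_voting_status(self, tile_id: str, status: str) -> None: ...


def record_rejection(
    store: RetryMetadataStoreLike, tile_id: str, max_per_tile_attempts: int
) -> bool:
    """Count one rejection; return True if the tile was given up on."""
    attempts = store.increment_upload_attempts(tile_id)
    if attempts >= max_per_tile_attempts:
        store.set_voting_status(tile_id, UPLOAD_GIVEUP)
        return True
    return False


class FakeStore:
    """Test double; satisfies the Protocol structurally, no inheritance."""

    def __init__(self) -> None:
        self.attempts: dict[str, int] = {}
        self.status: dict[str, str] = {}

    def increment_upload_attempts(self, tile_id: str) -> int:
        self.attempts[tile_id] = self.attempts.get(tile_id, 0) + 1
        return self.attempts[tile_id]

    def set_voting_status(self, tile_id: str, status: str) -> None:
        self.status[tile_id] = status
```

Because `Protocol` matching is structural, any store with these two
methods type-checks against the cut without either side importing the
other.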

Tests: 13 decorator + 1 conformance + 2 factory bypass + AC-6 enum
update + alembic head bump + AZ-272 schema fixture. 238 passed across
c11/c6/fdr suites; the pre-existing perf microbenches are unrelated to
this change.

Code review: PASS_WITH_WARNINGS (5 Low/Informational findings,
docs-level or downstream-CI-blocked). See
_docs/03_implementation/reviews/batch_41_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-13 08:48:53 +03:00
parent 90f4ac78f4
commit a06b107fc3
19 changed files with 1788 additions and 21 deletions
@@ -36,7 +36,9 @@ from gps_denied_onboard.components.c11_tile_manager import (
     FlightStateSource,
     HttpTileDownloader,
     HttpTileUploader,
+    IdempotentRetryTileUploader,
     PerFlightKeyManager,
+    TileUploader,
 )
 from gps_denied_onboard.config.schema import ConfigError
 from gps_denied_onboard.fdr_client import FdrClient, make_fdr_client
@@ -44,6 +46,7 @@ from gps_denied_onboard.logging import get_logger
 if TYPE_CHECKING:
     from gps_denied_onboard.clock import Clock
+    from gps_denied_onboard.clock.interface import Clock as ClockProtocol
     from gps_denied_onboard.config.schema import Config
 __all__ = [
@@ -107,9 +110,10 @@ def build_tile_uploader(
     tile_metadata_store: Any,
     flight_state_gate: FlightStateGate,
     key_manager: PerFlightKeyManager,
+    clock: ClockProtocol | None = None,
     fdr_client: FdrClient | None = None,
-) -> HttpTileUploader:
-    """Construct a wired :class:`HttpTileUploader` (AZ-319).
+) -> TileUploader:
+    """Construct a wired :class:`TileUploader` for AZ-319 (+ AZ-320 retry).
 
     The c6 surfaces (``tile_store``, ``tile_metadata_store``) are
     consumer-side cuts injected here by the operator-binary
@@ -117,6 +121,16 @@ def build_tile_uploader(
     is also caller-owned: production wiring uses one long-lived
     :class:`httpx.Client` per process; tests inject
     ``httpx.Client(transport=httpx.MockTransport(...))``.
+
+    By default the bare :class:`HttpTileUploader` is wrapped in the
+    AZ-320 :class:`IdempotentRetryTileUploader` decorator (per-call +
+    per-tile bounded retry). Setting
+    ``config.components['c11_tile_manager'].disable_retry_decorator =
+    True`` suppresses the wrapping, which is useful for low-level
+    debugging or test wiring that wants to observe the inner uploader
+    directly. The decorator needs a ``clock``; if none is passed, a
+    default :class:`WallClock` is constructed (matches the production
+    operator-binary wiring pattern).
     """
     block = config.components.get("c11_tile_manager")
@@ -144,7 +158,7 @@ def build_tile_uploader(
     if fdr_client is None:
         fdr_client = make_fdr_client(_C11_UPLOADER_PRODUCER_ID, config)
     logger = get_logger(_C11_UPLOADER_LOGGER)
-    return HttpTileUploader(
+    inner = HttpTileUploader(
         http_client=http_client,
         tile_store=tile_store,
         tile_metadata_store=tile_metadata_store,
@@ -155,6 +169,31 @@ def build_tile_uploader(
         config=block,
     )
+    if block.disable_retry_decorator:
+        logger.info(
+            "AZ-320 retry decorator BYPASSED (config.disable_retry_decorator = true)",
+            extra={
+                "component": "c11_tile_manager.tile_uploader",
+                "kind": "c11.upload.retry.decorator.bypassed",
+                "kv": {"reason": "config_flag"},
+            },
+        )
+        return inner
+    if clock is None:
+        from gps_denied_onboard.clock.wall_clock import WallClock
+        clock = WallClock()
+    decorator_logger = get_logger("c11_tile_manager.idempotent_retry")
+    return IdempotentRetryTileUploader(
+        inner=inner,
+        tile_metadata_store=tile_metadata_store,
+        fdr_client=fdr_client,
+        logger=decorator_logger,
+        clock=clock,
+        config=block.retry,
+    )
 def build_tile_downloader(
     config: Config,