Mirror of https://github.com/azaion/gps-denied-onboard.git (synced 2026-06-21 20:11:15 +00:00)
[AZ-320] Add C11 IdempotentRetryTileUploader decorator
Wraps HttpTileUploader (AZ-319) with two bounded retry budgets:

- In-call (per-batch): re-invokes inner on PARTIAL outcome up to `max_in_call_retries` times with capped exponential backoff (`min(base ** attempt_number, cap)`). On exhaustion: surfaces an operator hint via `next_retry_at_s = now + backoff_cap_s`.
- Per-tile (cross-call): atomically increments c6's `tiles.upload_attempts` counter for every rejection; once a tile hits `max_per_tile_attempts` it is forward-only transitioned to `voting_status = upload_giveup` (excluded from `pending_uploads`). Each transition emits FDR `kind="c11.upload.giveup"` plus an ERROR log.

C6 contract changes (AZ-303 v1.3.0):

- VotingStatus.UPLOAD_GIVEUP added (forward-only from PENDING/TRUSTED).
- TileMetadataStore.increment_upload_attempts(tile_id) -> int added with NotImplementedError default for backwards-compat.
- Migration 0003_c11_upload_attempts: additive column + widened ck_tiles_voting_status (preserves IS NULL clause).

C11 wiring:

- C11RetryConfig + disable_retry_decorator on C11Config.
- build_tile_uploader wraps in decorator by default; bypass flag returns the bare HttpTileUploader. New `clock` keyword.

Cross-component isolation honoured (AZ-507): the decorator declares `_RetryMetadataStoreLike` Protocol cut over c6's TileMetadataStore and references `UPLOAD_GIVEUP` via a local string constant (no c6 imports).

Tests: 13 decorator + 1 conformance + 2 factory bypass + AC-6 enum update + alembic head bump + AZ-272 schema fixture. 238 passed across c11/c6/fdr suites; pre-existing perf microbenches unrelated.

Code review: PASS_WITH_WARNINGS (5 Low/Informational findings, docs-level or downstream-CI-blocked). See _docs/03_implementation/reviews/batch_41_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
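The in-call budget described in the commit message can be illustrated with a minimal sketch of the capped backoff formula `min(base ** attempt_number, cap)`. This is not the decorator's actual code: the function names `backoff_delay_s`, `in_call_schedule`, and `operator_hint_next_retry_at_s` are hypothetical, and the sketch assumes `attempt_number` starts at 1 and that all values are in seconds.

```python
from typing import List


def backoff_delay_s(attempt_number: int, base: float, cap: float) -> float:
    """Delay before retry `attempt_number`: min(base ** attempt_number, cap)."""
    if attempt_number < 1:
        # Assumption: attempts are numbered from 1 (the commit does not say).
        raise ValueError("attempt_number is assumed to start at 1")
    try:
        raw = base ** attempt_number
    except OverflowError:
        # Float exponentiation overflows for very large attempts; the cap wins anyway.
        return cap
    return min(raw, cap)


def in_call_schedule(max_in_call_retries: int, base: float, cap: float) -> List[float]:
    """Delays for every retry allowed inside one call (one per PARTIAL outcome)."""
    return [backoff_delay_s(n, base, cap) for n in range(1, max_in_call_retries + 1)]


def operator_hint_next_retry_at_s(now: float, backoff_cap_s: float) -> float:
    """Hint surfaced once the in-call budget is exhausted: now + backoff_cap_s."""
    return now + backoff_cap_s
```

With `base=2.0` and `cap=10.0`, five retries would wait 2, 4, 8, 10, and 10 seconds, so the delay grows geometrically until the cap bounds it.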
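The per-tile (cross-call) budget from the commit message can be sketched as follows. This is an illustration under stated assumptions, not the shipped decorator: `InMemoryAttemptStore` is a hypothetical in-memory stand-in for c6's store (a lock replaces the per-row SQL lock), `mark_upload_giveup` and `record_rejections` are invented names, and the real `pending_uploads()` query presumably applies more filters than the single exclusion shown here.

```python
import logging
import threading
from typing import Dict, Iterable, List

logger = logging.getLogger("c11.retry.sketch")

# Local string constant, mirroring the commit's "no c6 imports" rule.
UPLOAD_GIVEUP = "upload_giveup"


class InMemoryAttemptStore:
    """Hypothetical stand-in for the metadata store; not the c6 implementation."""

    def __init__(self, tile_ids: Iterable[str]) -> None:
        self._lock = threading.Lock()
        self.upload_attempts: Dict[str, int] = {t: 0 for t in tile_ids}
        self.voting_status: Dict[str, str] = {t: "pending" for t in self.upload_attempts}

    def increment_upload_attempts(self, tile_id: str) -> int:
        # The lock plays the role of the atomic per-row increment.
        with self._lock:
            self.upload_attempts[tile_id] += 1
            return self.upload_attempts[tile_id]

    def mark_upload_giveup(self, tile_id: str) -> bool:
        """Transition to upload_giveup; returns False if already there."""
        with self._lock:
            if self.voting_status[tile_id] == UPLOAD_GIVEUP:
                return False
            self.voting_status[tile_id] = UPLOAD_GIVEUP
            return True

    def pending_uploads(self) -> List[str]:
        # Only the exclusion named in the commit is modelled here.
        with self._lock:
            return [t for t, s in self.voting_status.items() if s != UPLOAD_GIVEUP]


def record_rejections(
    store: InMemoryAttemptStore, rejected: Iterable[str], max_per_tile_attempts: int
) -> List[str]:
    """Count one attempt per rejected tile; return tiles that just gave up."""
    gave_up: List[str] = []
    for tile_id in rejected:
        attempts = store.increment_upload_attempts(tile_id)
        if attempts >= max_per_tile_attempts and store.mark_upload_giveup(tile_id):
            logger.error("tile %s gave up after %d upload attempts", tile_id, attempts)
            gave_up.append(tile_id)
    return gave_up
```

Because the counter lives in the store rather than in the decorator, the budget survives across calls, which is what distinguishes it from the in-call retry budget.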
@@ -7,9 +7,9 @@
 - AZ-TBD-c6-freshness-gate (insert hook + sector classification reader)
 - AZ-TBD-c6-cache-budget-eviction (LRU candidate enumeration + delete coordination)
 - TBD at decompose time: E-C10 (AZ-252 — manifest + provisioning), E-C11 (AZ-251 — both `TileDownloader` insert and `TileUploader` reader queries), E-C12 (AZ-253 — operator pre-flight tooling)
-**Version**: 1.2.0
+**Version**: 1.3.0
 **Status**: draft
-**Last Updated**: 2026-05-12
+**Last Updated**: 2026-05-13

 ## Purpose

@@ -32,6 +32,7 @@ Defines the typed boundary to the Postgres-backed spatial index over `TileMetada
 | `lru_candidates` | `(*, max_count: int) -> list[TileMetadata]` | `TileMetadataError` | sync (oldest-`accessed_at`-first; bounded result set) |
 | `total_disk_bytes` | `() -> int` | `TileMetadataError` | sync (sum of `disk_bytes` column; ≤ 100 ms even at 100k rows) |
 | `get_by_id` | `(tile_id: TileId) -> Optional[TileMetadata]` | `TileMetadataError` | sync; returns `None` if absent (NOT `TileNotFoundError`) |
+| `increment_upload_attempts` | `(tile_id: TileId) -> int` | `TileMetadataError`, `TileNotFoundError` | sync; atomic ``UPDATE … RETURNING`` (per-row lock); added in v1.3.0 |

 ### DTOs

@@ -99,7 +100,7 @@ class SectorBoundary:
 - **I-5 (disk-budget invariant):** `total_disk_bytes` MUST equal `SUM(disk_bytes)` over all rows where `voting_status != REJECTED`. Rejected rows are tombstones — they keep the on-disk file deleted but retain the row for the manifest's content-hash check (D-C10-3).
 - **I-6 (frozen DTOs):** `Bbox`, `SectorBoundary`, `TileMetadataPersistent` are `@dataclass(frozen=True)`.
 - **I-7 (transactional writes):** `insert_metadata` is a single transaction over the `tiles` table; the freshness check + the row insert MUST be atomic (a parallel sector-boundary update MUST NOT race the gate).
-- **I-8 (no silent voting-status downgrade):** `update_voting_status` accepts only forward transitions (`PENDING → TRUSTED`, `PENDING → REJECTED`); a backward transition raises `TileMetadataError`. `TRUSTED → REJECTED` is allowed (covers the cache-poisoning recall path).
+- **I-8 (no silent voting-status downgrade):** `update_voting_status` accepts only forward transitions (`PENDING → TRUSTED`, `PENDING → REJECTED`, `TRUSTED → REJECTED`, `PENDING → UPLOAD_GIVEUP`, `TRUSTED → UPLOAD_GIVEUP`); a backward transition raises `TileMetadataError`. `TRUSTED → REJECTED` covers the cache-poisoning recall path; the two `UPLOAD_GIVEUP` transitions (added in v1.3.0 by AZ-320) cover the C11 retry decorator's per-tile budget exhaustion. `UPLOAD_GIVEUP → anything` is forbidden — recovery is an out-of-band SQL UPDATE by the operator.
 - **I-9 (`pending_uploads` is the single source for C11 TileUploader):** the uploader MUST NOT scan the filesystem for pending tiles; it MUST drive its loop off `pending_uploads()`. The metadata store is the bookkeeping.

 ## Non-Goals
@@ -144,3 +145,4 @@ Same rules as `tile_store.md` § Versioning Rules.
 | 1.0.0 | 2026-05-10 | Initial contract — 9-method Protocol + LRU/disk-budget extensions + freshness gate semantics + composite-key uniqueness invariant. | autodev (decompose Step 2 of AZ-250 / E-C6) |
 | 1.1.0 | 2026-05-12 | Non-breaking refinement of Invariant I-1: natural key switched from `(zoom_level, lat, lon, source)` (float-based) to `(zoom_level, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, zero_uuid))` (integer + per-flight separated). Protocol surface unchanged; consumers gain the ability to observe multiple ONBOARD_INGEST rows for the same cell from different flights (required by D-PROJ-2 voting). Driven by `_docs/_process_leftovers/2026-05-12_tile-schema-scenario-analysis.md` and the cross-workspace satellite-provider task `AZ-TBD_tile_identity_uuidv5_bulk_list`. | autodev (AZ-304 batch 27 of cycle 1) |
 | 1.2.0 | 2026-05-12 | Non-breaking addition of `TileMetadata.location_hash: UUID \| None = None` (cross-source/cross-flight cell-bag identifier; UUIDv5 over `(zoom, tile_x, tile_y)`). Corrected stale references: sector table name (`sector_boundaries` → `sector_classifications`) and Alembic env path (`c6_tile_cache/_alembic/` → `db/migrations/versions/`). Protocol surface unchanged; existing constructors continue to work because the field defaults to `None`. Shipped by AZ-304 alongside the additive `0002_c6_tile_identity_and_lru` migration. | autodev (AZ-304 batch 27 of cycle 1) |
+| 1.3.0 | 2026-05-13 | Non-breaking addition of (a) `VotingStatus.UPLOAD_GIVEUP` terminal state, (b) two new forward transitions (`PENDING → UPLOAD_GIVEUP`, `TRUSTED → UPLOAD_GIVEUP`) under Invariant I-8, (c) the `increment_upload_attempts(tile_id) -> int` Protocol method (atomic per-row UPDATE … RETURNING), and (d) the `tiles.upload_attempts INTEGER NOT NULL DEFAULT 0` column. The Protocol method body raises `NotImplementedError` so legacy duck-typed impls keep their conformance — production wiring uses `PostgresFilesystemStore` which ships the SQL. `pending_uploads()` now also excludes `voting_status = upload_giveup`. Shipped by AZ-320 (C11 retry decorator) alongside the additive `0003_c11_upload_attempts` migration. | autodev (AZ-320 batch 41 of cycle 1) |
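
The forward-only rule of Invariant I-8, as extended in v1.3.0, can be pictured as a small allow-list check. The sketch below is illustrative only and is not the contract's implementation: the enum's string values other than `upload_giveup` are assumed, `check_transition` is a hypothetical helper name, and treating a same-state "transition" as an error is an assumption the contract does not spell out.

```python
from enum import Enum


class VotingStatus(Enum):
    # Only "upload_giveup" appears verbatim in the contract; the rest are assumed.
    PENDING = "pending"
    TRUSTED = "trusted"
    REJECTED = "rejected"
    UPLOAD_GIVEUP = "upload_giveup"


class TileMetadataError(Exception):
    """Local stand-in for the contract's error type."""


# The five forward transitions listed under I-8 in v1.3.0.
_ALLOWED_TRANSITIONS = frozenset(
    {
        (VotingStatus.PENDING, VotingStatus.TRUSTED),
        (VotingStatus.PENDING, VotingStatus.REJECTED),
        (VotingStatus.TRUSTED, VotingStatus.REJECTED),
        (VotingStatus.PENDING, VotingStatus.UPLOAD_GIVEUP),
        (VotingStatus.TRUSTED, VotingStatus.UPLOAD_GIVEUP),
    }
)


def check_transition(current: VotingStatus, new: VotingStatus) -> None:
    """Raise TileMetadataError unless (current, new) is an allowed forward move."""
    if (current, new) not in _ALLOWED_TRANSITIONS:
        raise TileMetadataError(
            f"forbidden voting-status transition: {current.name} -> {new.name}"
        )
```

Since no pair in the allow-list starts from `UPLOAD_GIVEUP`, the state is terminal by construction, which matches the contract's statement that recovery happens through an out-of-band SQL UPDATE rather than through `update_voting_status`.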