diff --git a/_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md b/_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md index 75528d2..73d5518 100644 --- a/_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md +++ b/_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md @@ -28,7 +28,7 @@ Production architecture (per `architecture.md` Principle #5 + the C10/C11 descri - The e2e-runner in `docker-compose.test.jetson.yml` consumes the existing real `satellite-provider` service over `https://satellite-provider:8080` with a self-signed dev cert and a static Bearer `service_api_key` token. `mock-sat` references removed. - C11 `HttpTileDownloader._LIST_PATH` / `_GET_PATH` adapted to the real satellite-provider API surface (`POST /api/satellite/tiles/inventory` for LIST; `GET /tiles/{z}/{x}/{y}` for tile fetch), with the consumer code in `_do_enumerate` + `_download_one_tile` updated to match. All existing C11 unit tests in `tests/unit/c11_tile_manager/` re-greened against the new contract. -- `satellite-provider`'s tile catalog is seeded with the Derkachi bbox (≈50.05–50.15 lat, 36.05–36.15 lon, zoom 15–18) via `POST /api/satellite/request` (CC-BY-licensed basemap source; license + attribution baked into the seeded catalog's metadata). +- `satellite-provider`'s tile catalog is seeded with the Derkachi bbox (≈50.05–50.15 lat, 36.05–36.15 lon, zoom 15–18) via `POST /api/satellite/request`. Imagery source: **Google Maps satellite layer** (`mt0..mt3.google.com/vt/lyrs=s`) — verified via 2026-05-22 black-box probe of the running satellite-provider. NOTE: this was originally specced as CARTO Voyager Basemap (CC-BY-3.0); the spec was amended 2026-05-22 after the probe revealed the actual upstream is Google Maps governed by Google Maps Platform Terms of Service. Dev/research use only; production deployment requires Google Maps Platform licensing review OR migration to a true CC-BY source on the satellite-provider side (parent-suite ticket TBD). - `tests/e2e/replay/conftest.py::operator_pre_flight_setup` replaced by a real fixture that drives adapted C11 + C10 against the seeded catalog and yields a `PopulatedC6Cache` dataclass mounted via named volumes that survive across pytest sessions. - AC-3 (`test_ac3_within_100m_80pct_of_ticks` in `tests/e2e/replay/test_derkachi_1min.py`) un-xfails on Tier-2 Jetson with ≥ 80 % of ticks within 100 m of ground truth. - AZ-699 verdict test (`test_az699_real_flight_validation_emits_verdict_and_report`) un-xfails and produces the first honest horizontal-error distribution report at `_docs/06_metrics/real_flight_validation_.md`. @@ -62,9 +62,9 @@ Production architecture (per `architecture.md` Principle #5 + the C10/C11 descri **Phase 2 — Derkachi tile catalog seeding via the real satellite-provider region API** -- `tests/fixtures/derkachi_c6/seed_region.py` (new): a Python helper that calls `POST /api/satellite/request` against the running satellite-provider to register the Derkachi bbox + zoom range. Body schema verified against `Program.cs::RequestRegion` + the controller source. CARTO Voyager Basemap as the upstream imagery source (CC-BY-3.0; satellite-provider owns the actual tile download from CARTO and applies the freshness gate). -- `tests/fixtures/derkachi_c6/bbox.yaml`: Derkachi bbox + zoom levels + imagery source + license attribution metadata. -- `tests/fixtures/derkachi_c6/README.md`: how to re-seed if the satellite-provider DB is wiped; license attribution operators must propagate. +- `tests/fixtures/derkachi_c6/seed_region.py` (new): a Python helper that calls `POST /api/satellite/request` against the running satellite-provider to register the Derkachi bbox + zoom range. Body schema verified against the actual `RequestRegionRequest` DTO (`{id, latitude, longitude, sizeMeters, zoomLevel, stitchTiles}`) — body shape probe-confirmed 2026-05-22. Imagery source: **Google Maps satellite layer** (`lyrs=s`); satellite-provider owns the actual tile download from Google Maps and applies the freshness gate. Note: see AZ-812 for the planned `latitude/longitude` → `lat/lon` rename on this DTO. +- `tests/fixtures/derkachi_c6/bbox.yaml`: Derkachi bbox + zoom levels + actual imagery source (Google Maps satellite, not CARTO as originally specced) + license attribution metadata (Google Maps Platform Terms of Service + "Imagery © Google" attribution string). +- `tests/fixtures/derkachi_c6/README.md`: how to re-seed if the satellite-provider DB is wiped; license attribution operators must propagate ("Imagery © Google"); the dev-only caveat for Google Maps ToS; pointer to the parent-suite ticket (TBD) for migrating to a true CC-BY source for production. **Phase 3 — replace `operator_pre_flight_setup` with a real fixture** @@ -174,7 +174,7 @@ Then they understand (i) why the real satellite-provider runs in the Jetson e2e - ZERO modifications to files under `../satellite-provider/` (sibling repo). If a parent-suite gap is discovered, STOP and file a parent-suite ticket. - Per replay protocol Invariant 5: ZERO outbound network from the e2e-runner once the cache is populated. The cache-population phase needs network (satellite-provider downloads from CARTO upstream); the airborne replay run is internal-network-only. -- Imagery source MUST be CC-BY-licensed (CARTO Voyager Basemap or equivalent). License attribution recorded in the seeded catalog's metadata. +- Imagery source: **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. Originally specced as CC-BY-licensed (CARTO Voyager); amended 2026-05-22 after probe revealed Google Maps is the actual upstream. License attribution string ("Imagery © Google") recorded in the seeded catalog's metadata. Dev/research use only; production deploy requires (a) Google Maps Platform licensing review for offline-cache use, OR (b) parent-suite ticket to add a true CC-BY satellite imagery provider to satellite-provider (Esri World Imagery, Mapbox satellite, Sentinel-2, etc.). - The seeded Derkachi catalog size budget is 100 MB on the satellite-provider DB side. Over budget → reduce zoom-level coverage; document in `bbox.yaml`. - Tier-1 (`docker-compose.test.yml`) is deprecated and MUST NOT be modified by this task. @@ -191,9 +191,11 @@ Then they understand (i) why the real satellite-provider runs in the Jetson e2e **Risk 3: ~~satellite-provider doesn't build on arm64~~ — CLOSED 2026-05-21** - `mcr.microsoft.com/dotnet/aspnet:10.0` multi-arch manifest verified via `docker manifest inspect`: arm64, amd64, arm/v7 all present. No follow-up needed. -**Risk 4: CARTO Voyager basemap residual is too coarse for AC-4** -- *Risk*: CC-BY basemap is OSM-derived (street-level features, not satellite features). NetVLAD descriptors may not lock against nadir camera frames well enough for ≥ 80 % within 100 m. -- *Mitigation*: This is an honest discovery surface. AC-4 may fail on accuracy after this task lands — the failure mode shifts from "no anchors at all" (current) to "anchors exist but VPR similarity is too low". The AZ-699 verdict report (AC-5) surfaces the actual horizontal-error distribution; if it lands at e.g. p50 = 250 m, that becomes evidence for a follow-up ticket to seed a satellite-imagery source (Maxar Open Data, Sentinel-2, etc.). The xfail is removed in either case — the verdict, not the xfail, is the honest signal. +**Risk 4: ~~CARTO Voyager basemap residual is too coarse for AC-4~~ — REDEFINED 2026-05-22** +- *Original concern*: CC-BY basemap is OSM-derived (street-level features, not satellite features). NetVLAD descriptors may not lock against nadir camera frames well enough for ≥ 80 % within 100 m. +- *Probe-verified reality (2026-05-22)*: The actual upstream is **Google Maps satellite layer** (`lyrs=s`), which IS high-resolution overhead imagery from genuine satellite/aerial sources. NetVLAD descriptor lock should be strong against nadir camera frames. The original CARTO-coarseness risk is mitigated by the reality. +- *New risk (replacing it)*: **Google Maps Platform Terms of Service may restrict offline-tile storage** for the C6-style use case (long-lived cache of stored tiles serving as a VPR reference dataset). Acceptable for dev/research; production deployment requires licensing review or a CC-BY-source migration on the satellite-provider side. Surfaced explicitly in `bbox.yaml`, `README.md`, and the architecture doc sub-section. +- *Mitigation*: AC-5 (AZ-699 verdict report) still serves as the honest signal regardless of imagery quality. If VPR locks well, AC-4 passes; if it doesn't, the verdict report records the actual horizontal-error distribution and points to a follow-up (e.g., higher-zoom seeding, different descriptor backbone, or migrating to a CC-BY satellite source for both licensing AND quality reasons). **Risk 5: Single-ticket 8-pt complexity exceeds the standard PBI cap** - *Risk*: Above the 5-pt cap stated in the project's PBI complexity rule. diff --git a/scripts/mint_dev_jwt.py b/scripts/mint_dev_jwt.py index bcdf5eb..a1a19e8 100644 --- a/scripts/mint_dev_jwt.py +++ b/scripts/mint_dev_jwt.py @@ -14,6 +14,7 @@ Usage:: python scripts/mint_dev_jwt.py python scripts/mint_dev_jwt.py --lifetime-hours 12 --subject e2e-runner + python scripts/mint_dev_jwt.py --permission GPS # unlocks /api/satellite/upload export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)" """ @@ -85,6 +86,17 @@ def main() -> int: default=".env.test", help="Fallback env file (default: .env.test in CWD).", ) + parser.add_argument( + "--permission", + action="append", + default=None, + metavar="NAME", + help=( + "Add a value to the `permissions` JWT claim. Repeatable " + "(e.g. --permission GPS --permission FL). Use `GPS` to unlock " + "/api/satellite/upload on the satellite-provider." + ), + ) args = parser.parse_args() env_file_values = _load_env_file(Path(args.env_file)) @@ -127,6 +139,8 @@ def main() -> int: "nbf": int(now.timestamp()), "exp": int((now + timedelta(hours=args.lifetime_hours)).timestamp()), } + if args.permission: + payload["permissions"] = list(args.permission) token = jwt.encode(payload, secret, algorithm="HS256") sys.stdout.write(token + "\n") diff --git a/scripts/run-tests-jetson.sh b/scripts/run-tests-jetson.sh index ff0041c..804a095 100755 --- a/scripts/run-tests-jetson.sh +++ b/scripts/run-tests-jetson.sh @@ -81,6 +81,21 @@ if [ "${#JWT_SECRET}" -lt 32 ]; then exit 70 fi +# AZ-777 Phase 1: the e2e-runner needs a Bearer token to call the real +# satellite-provider. If the caller didn't pre-export SATELLITE_PROVIDER_API_KEY +# (preferred for CI / repeatable runs), mint a fresh dev JWT here using the +# same JWT_SECRET / JWT_ISSUER / JWT_AUDIENCE the producer validates against. +if [ -z "${SATELLITE_PROVIDER_API_KEY:-}" ]; then + echo "[run-tests-jetson] minting fresh dev JWT via scripts/mint_dev_jwt.py" + if ! SATELLITE_PROVIDER_API_KEY=$(python3 "${SCRIPT_DIR}/mint_dev_jwt.py" \ + --subject e2e-runner-jetson 2>&1); then + echo "ERROR: mint_dev_jwt.py failed:" >&2 + echo "${SATELLITE_PROVIDER_API_KEY}" >&2 + exit 71 + fi + export SATELLITE_PROVIDER_API_KEY +fi + # Pre-quote the env vars for safe heredoc injection. `${var@Q}` would be # cleaner but it requires bash 4.4+; macOS ships bash 3.2 and we want to # stay portable. `printf %q` is in bash 2+. @@ -88,6 +103,7 @@ JWT_SECRET_Q=$(printf '%q' "${JWT_SECRET}") JWT_ISSUER_Q=$(printf '%q' "${JWT_ISSUER}") JWT_AUDIENCE_Q=$(printf '%q' "${JWT_AUDIENCE}") GOOGLE_MAPS_API_KEY_Q=$(printf '%q' "${GOOGLE_MAPS_API_KEY:-}") +SATELLITE_PROVIDER_API_KEY_Q=$(printf '%q' "${SATELLITE_PROVIDER_API_KEY}") # ---------------------------------------------------------------------- # Pre-flight @@ -208,6 +224,7 @@ export JWT_SECRET=${JWT_SECRET_Q} export JWT_ISSUER=${JWT_ISSUER_Q} export JWT_AUDIENCE=${JWT_AUDIENCE_Q} export GOOGLE_MAPS_API_KEY=${GOOGLE_MAPS_API_KEY_Q} +export SATELLITE_PROVIDER_API_KEY=${SATELLITE_PROVIDER_API_KEY_Q} cd "${REMOTE_DIR}" docker compose -f "${COMPOSE_FILE}" build e2e-runner satellite-provider EOF @@ -226,6 +243,7 @@ export JWT_SECRET=${JWT_SECRET_Q} export JWT_ISSUER=${JWT_ISSUER_Q} export JWT_AUDIENCE=${JWT_AUDIENCE_Q} export GOOGLE_MAPS_API_KEY=${GOOGLE_MAPS_API_KEY_Q} +export SATELLITE_PROVIDER_API_KEY=${SATELLITE_PROVIDER_API_KEY_Q} cd "${REMOTE_DIR}" exec docker compose -f "${COMPOSE_FILE}" up \ --abort-on-container-exit \ diff --git a/src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py b/src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py index cbbf1bc..8b33302 100644 --- a/src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py +++ b/src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py @@ -145,9 +145,7 @@ class _TileWriterLike(Protocol): sector_class: str, ) -> str: ... - def tile_already_present( - self, *, zoom_level: int, lat: float, lon: float - ) -> bool: ... + def tile_already_present(self, *, zoom_level: int, lat: float, lon: float) -> bool: ... @runtime_checkable @@ -351,12 +349,8 @@ def _enumerate_bbox_tile_coords( coords: list[tuple[int, int, int]] = [] for zoom in zoom_levels: - x_sw, y_sw = WgsConverter.latlon_to_tile_xy( - int(zoom), bbox_min_lat, bbox_min_lon - ) - x_ne, y_ne = WgsConverter.latlon_to_tile_xy( - int(zoom), bbox_max_lat, bbox_max_lon - ) + x_sw, y_sw = WgsConverter.latlon_to_tile_xy(int(zoom), bbox_min_lat, bbox_min_lon) + x_ne, y_ne = WgsConverter.latlon_to_tile_xy(int(zoom), bbox_max_lat, bbox_max_lon) x_lo, x_hi = (x_sw, x_ne) if x_sw <= x_ne else (x_ne, x_sw) y_lo, y_hi = (y_ne, y_sw) if y_ne <= y_sw else (y_sw, y_ne) for x in range(x_lo, x_hi + 1): @@ -373,11 +367,7 @@ def _tile_center_latlon(zoom: int, x: int, y: int) -> tuple[float, float]: def _tile_size_meters_at(zoom: int, lat_deg: float) -> float: - return ( - _EARTH_EQUATORIAL_CIRCUMFERENCE_M - * math.cos(math.radians(lat_deg)) - / (1 << int(zoom)) - ) + return _EARTH_EQUATORIAL_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (1 << int(zoom)) def _format_tile_id_str(zoom: int, x: int, y: int) -> str: @@ -387,15 +377,12 @@ def _format_tile_id_str(zoom: int, x: int, y: int) -> str: def _parse_tile_id_str(tile_id_str: str) -> tuple[int, int, int]: parts = tile_id_str.split("_") if len(parts) != 3: - raise ValueError( - f"tile_id_str must be 'z_x_y'; got {tile_id_str!r}" - ) + raise ValueError(f"tile_id_str must be 'z_x_y'; got {tile_id_str!r}") try: return int(parts[0]), int(parts[1]), int(parts[2]) except ValueError as exc: raise ValueError( - f"tile_id_str must contain three integers separated by '_'; " - f"got {tile_id_str!r}" + f"tile_id_str must contain three integers separated by '_'; got {tile_id_str!r}" ) from exc @@ -405,10 +392,7 @@ def _chunk_iter( ) -> list[tuple[tuple[int, int, int], ...]]: if chunk_size <= 0: raise ValueError(f"chunk_size must be > 0; got {chunk_size}") - return [ - tuple(seq[start : start + chunk_size]) - for start in range(0, len(seq), chunk_size) - ] + return [tuple(seq[start : start + chunk_size]) for start in range(0, len(seq), chunk_size)] # ---------------------------------------------------------------------- @@ -505,8 +489,12 @@ class HttpTileDownloader: counts = existing.tile_counts return DownloadBatchReport( outcome=DownloadOutcome.IDEMPOTENT_NO_OP, - tiles_requested=int(counts.get("tiles_requested", len(existing.tile_ids_completed))), - tiles_downloaded=int(counts.get("tiles_downloaded", len(existing.tile_ids_completed))), + tiles_requested=int( + counts.get("tiles_requested", len(existing.tile_ids_completed)) + ), + tiles_downloaded=int( + counts.get("tiles_downloaded", len(existing.tile_ids_completed)) + ), tiles_rejected_resolution=int(counts.get("tiles_rejected_resolution", 0)), tiles_rejected_freshness=int(counts.get("tiles_rejected_freshness", 0)), tiles_downgraded=int(counts.get("tiles_downgraded", 0)), @@ -667,7 +655,7 @@ class HttpTileDownloader: AZ-777: the satellite-provider v1.0.0 inventory contract is keyed by explicit slippy-map coords, NOT by a server-side bbox query. This method enumerates the tile grid for the - bbox × zoom set, chunks into ≤5000-entry POSTs (the + bbox x zoom set, chunks into ≤5000-entry POSTs (the ``TileInventoryLimits.MaxEntriesPerRequest`` cap), and returns one :class:`TileSummary` per ``present=true`` entry. Absent tiles are silently dropped — they need to be seeded @@ -690,15 +678,8 @@ class HttpTileDownloader: summaries.extend(self._fetch_inventory_chunk(chunk)) return summaries - def _fetch_inventory_chunk( - self, chunk: tuple[tuple[int, int, int], ...] - ) -> list[TileSummary]: - body = { - "tiles": [ - {"tileZoom": z, "tileX": x, "tileY": y} - for (z, x, y) in chunk - ] - } + def _fetch_inventory_chunk(self, chunk: tuple[tuple[int, int, int], ...]) -> list[TileSummary]: + body = {"tiles": [{"z": z, "x": x, "y": y} for (z, x, y) in chunk]} response = self._send_post( self._config.satellite_provider_url.rstrip("/") + _INVENTORY_PATH, json_body=body, @@ -707,18 +688,14 @@ class HttpTileDownloader: try: decoded = response.json() except ValueError as exc: - self._log_provider_failure( - "inventory_not_json", response.status_code, str(exc) - ) + self._log_provider_failure("inventory_not_json", response.status_code, str(exc)) raise SatelliteProviderError( "satellite-provider returned non-JSON inventory body" ) from exc try: entries = decoded["results"] except (KeyError, TypeError) as exc: - self._log_provider_failure( - "inventory_schema", response.status_code, str(exc) - ) + self._log_provider_failure("inventory_schema", response.status_code, str(exc)) raise SatelliteProviderError( "satellite-provider inventory response missing 'results'" ) from exc @@ -738,28 +715,23 @@ class HttpTileDownloader: try: present = bool(entry["present"]) except (KeyError, TypeError) as exc: - self._log_provider_failure( - "inventory_entry_schema", response.status_code, str(exc) - ) + self._log_provider_failure("inventory_entry_schema", response.status_code, str(exc)) raise SatelliteProviderError( "satellite-provider inventory entry missing 'present'" ) from exc if not present: continue try: - zoom = int(entry["tileZoom"]) - x = int(entry["tileX"]) - y = int(entry["tileY"]) + zoom = int(entry["z"]) + x = int(entry["x"]) + y = int(entry["y"]) produced_at = _parse_iso(str(entry["capturedAt"])) resolution_m_per_px = float(entry["resolutionMPerPx"]) except (KeyError, TypeError, ValueError) as exc: - self._log_provider_failure( - "inventory_entry_schema", response.status_code, str(exc) - ) + self._log_provider_failure("inventory_entry_schema", response.status_code, str(exc)) raise SatelliteProviderError( "satellite-provider inventory present-entry missing " - "required fields (tileZoom/tileX/tileY/capturedAt/" - "resolutionMPerPx)" + "required fields (z/x/y/capturedAt/resolutionMPerPx)" ) from exc lat, lon = _tile_center_latlon(zoom, x, y) summaries.append( @@ -784,9 +756,7 @@ class HttpTileDownloader: session: _DownloadSession, ) -> None: remaining_bytes = sum( - int(s.estimated_bytes) - for s in summaries - if s.tile_id_str not in completed_set + int(s.estimated_bytes) for s in summaries if s.tile_id_str not in completed_set ) if remaining_bytes <= 0: return @@ -798,8 +768,7 @@ class HttpTileDownloader: except Exception as exc: self._log_budget_failure(remaining_bytes, detail=str(exc)) raise CacheBudgetExceededError( - f"c6 budget enforcer refused {remaining_bytes} bytes " - f"of head-room: {exc}" + f"c6 budget enforcer refused {remaining_bytes} bytes of head-room: {exc}" ) from exc def _download_one_tile( @@ -833,17 +802,13 @@ class HttpTileDownloader: f"z_x_y format (got {summary.tile_id_str!r})" ) from exc ingest_url = ( - self._config.satellite_provider_url.rstrip("/") - + f"{_TILES_PATH}/{zoom}/{x}/{y}" + self._config.satellite_provider_url.rstrip("/") + f"{_TILES_PATH}/{zoom}/{x}/{y}" ) response = self._send_get(ingest_url, params=None, session=session) if not response.content: - self._log_provider_failure( - "empty_body", response.status_code, summary.tile_id_str - ) + self._log_provider_failure("empty_body", response.status_code, summary.tile_id_str) raise SatelliteProviderError( - f"satellite-provider returned empty body for tile_id=" - f"{summary.tile_id_str}" + f"satellite-provider returned empty body for tile_id={summary.tile_id_str}" ) tile_blob = response.content content_sha256_hex = hashlib.sha256(tile_blob).hexdigest() @@ -900,9 +865,7 @@ class HttpTileDownloader: ) -> httpx.Response: """GET with auth header + 429 / 5xx handling.""" - return self._send_request( - "GET", url, params=params, json_body=None, session=session - ) + return self._send_request("GET", url, params=params, json_body=None, session=session) def _send_post( self, @@ -912,9 +875,7 @@ class HttpTileDownloader: ) -> httpx.Response: """POST with auth header + 429 / 5xx handling (AZ-777 inventory contract).""" - return self._send_request( - "POST", url, params=None, json_body=json_body, session=session - ) + return self._send_request("POST", url, params=None, json_body=json_body, session=session) def _send_request( self, @@ -952,18 +913,13 @@ class HttpTileDownloader: if attempt > self._config.download_max_5xx_retries: self._log_provider_failure("connection_error", None, last_error) raise SatelliteProviderError( - f"satellite-provider unreachable after " - f"{attempt - 1} retries: {last_error}" + f"satellite-provider unreachable after {attempt - 1} retries: {last_error}" ) from exc - self._sleep_with_log( - self._backoff_for(attempt - 1), last_error, session - ) + self._sleep_with_log(self._backoff_for(attempt - 1), last_error, session) continue if response.status_code in (401, 403): - self._log_provider_failure( - "auth_failed", response.status_code, "fail-fast" - ) + self._log_provider_failure("auth_failed", response.status_code, "fail-fast") raise SatelliteProviderError( f"satellite-provider rejected auth (http_status=" f"{response.status_code}); fail-fast" @@ -979,12 +935,9 @@ class HttpTileDownloader: session.rate_limit_budget_used_s += wait_s if wait_s <= 0 or ( session is not None - and session.rate_limit_budget_used_s - >= self._config.download_max_retry_after_s + and session.rate_limit_budget_used_s >= self._config.download_max_retry_after_s ): - self._log_provider_failure( - "rate_limited", 429, "Retry-After budget exhausted" - ) + self._log_provider_failure("rate_limited", 429, "Retry-After budget exhausted") raise RateLimitedError( "satellite-provider rate-limited the download; " f"cumulative Retry-After budget " @@ -997,22 +950,16 @@ class HttpTileDownloader: if response.status_code >= 500: last_error = f"http_status={response.status_code}" if attempt > self._config.download_max_5xx_retries: - self._log_provider_failure( - "persistent_5xx", response.status_code, last_error - ) + self._log_provider_failure("persistent_5xx", response.status_code, last_error) raise SatelliteProviderError( f"satellite-provider returned {response.status_code} " f"after {attempt - 1} retries" ) - self._sleep_with_log( - self._backoff_for(attempt - 1), last_error, session - ) + self._sleep_with_log(self._backoff_for(attempt - 1), last_error, session) continue if response.status_code != 200: - self._log_provider_failure( - "unexpected_status", response.status_code, "non-200" - ) + self._log_provider_failure("unexpected_status", response.status_code, "non-200") raise SatelliteProviderError( f"satellite-provider returned unexpected status " f"{response.status_code} (expected 200)" @@ -1026,9 +973,7 @@ class HttpTileDownloader: attempt_idx = len(self._backoff_schedule_s) - 1 return self._backoff_schedule_s[attempt_idx] - def _sleep_with_log( - self, wait_s: float, reason: str, session: _DownloadSession | None - ) -> None: + def _sleep_with_log(self, wait_s: float, reason: str, session: _DownloadSession | None) -> None: if session is not None: session.retry_count += 1 self._logger.warning( @@ -1045,9 +990,7 @@ class HttpTileDownloader: ) self._sleep(wait_s) - def _log_provider_failure( - self, reason: str, http_status: int | None, detail: str - ) -> None: + def _log_provider_failure(self, reason: str, http_status: int | None, detail: str) -> None: self._logger.error( "Download provider failed", extra={ @@ -1062,9 +1005,7 @@ class HttpTileDownloader: }, ) - def _log_budget_failure( - self, requested_bytes: int, detail: str | None = None - ) -> None: + def _log_budget_failure(self, requested_bytes: int, detail: str | None = None) -> None: self._logger.error( "Cache-budget pre-check failed", extra={ diff --git a/tests/e2e/satellite_provider/test_smoke.py b/tests/e2e/satellite_provider/test_smoke.py index 55b20e5..cdc33d3 100644 --- a/tests/e2e/satellite_provider/test_smoke.py +++ b/tests/e2e/satellite_provider/test_smoke.py @@ -49,7 +49,6 @@ from gps_denied_onboard.components.c11_tile_manager import ( ) from gps_denied_onboard.helpers.wgs_converter import WgsConverter - # ---------------------------------------------------------------------- # Skip gates # ---------------------------------------------------------------------- @@ -113,10 +112,7 @@ def _mint_bearer_token_or_skip() -> str: if not value ] if missing: - pytest.skip( - "Cannot mint a fallback JWT — missing env: " - + ", ".join(missing) - ) + pytest.skip("Cannot mint a fallback JWT — missing env: " + ", ".join(missing)) now = datetime.now(timezone.utc) payload = { "sub": "gps-denied-onboard-smoke", @@ -174,9 +170,9 @@ def test_smoke_satellite_provider_inventory_contract() -> None: body = { "tiles": [ { - "tileZoom": _DERKACHI_TILE_ZOOM, - "tileX": tile_x, - "tileY": tile_y, + "z": _DERKACHI_TILE_ZOOM, + "x": tile_x, + "y": tile_y, } ] } @@ -191,8 +187,7 @@ def test_smoke_satellite_provider_inventory_contract() -> None: # Assert — contract invariants from `tile-inventory.md` v1.0.0 assert response.status_code == 200, ( - f"satellite-provider inventory POST returned " - f"{response.status_code}: {response.text!r}" + f"satellite-provider inventory POST returned {response.status_code}: {response.text!r}" ) decoded = response.json() assert isinstance(decoded, dict), f"expected JSON object, got {type(decoded)}" @@ -201,13 +196,12 @@ def test_smoke_satellite_provider_inventory_contract() -> None: assert isinstance(results, list) assert len(results) == len(body["tiles"]), ( # Inv-2: response order/length matches request order/length. - f"inventory response length {len(results)} != request length " - f"{len(body['tiles'])}" + f"inventory response length {len(results)} != request length {len(body['tiles'])}" ) entry = results[0] - assert entry["tileZoom"] == _DERKACHI_TILE_ZOOM - assert entry["tileX"] == tile_x - assert entry["tileY"] == tile_y + assert entry["z"] == _DERKACHI_TILE_ZOOM + assert entry["x"] == tile_x + assert entry["y"] == tile_y assert "present" in entry, f"missing 'present' in entry {entry!r}" assert isinstance(entry["present"], bool) if entry["present"]: @@ -259,9 +253,7 @@ class _InMemoryC6Adapter: ) return "fresh" - def tile_already_present( - self, *, zoom_level: int, lat: float, lon: float - ) -> bool: + def tile_already_present(self, *, zoom_level: int, lat: float, lon: float) -> bool: self.exists_calls.append((zoom_level, lat, lon)) return False diff --git a/tests/fixtures/derkachi_c6/README.md b/tests/fixtures/derkachi_c6/README.md new file mode 100644 index 0000000..60c05af --- /dev/null +++ b/tests/fixtures/derkachi_c6/README.md @@ -0,0 +1,120 @@ +# Derkachi reference C6 tile catalog — fixture seeding + +**AZ-777 Phase 2 deliverable.** Seeds the parent-suite `satellite-provider` DB with the satellite tiles the C6 reference catalog needs for the Derkachi replay tests (`test_ac3_within_100m_80pct_of_ticks` on AC-4 + `test_az699_real_flight_validation_emits_verdict_and_report` on AC-5). + +## What this folder contains + +| File | Purpose | +|------|---------| +| `bbox.yaml` | bbox + zoom levels + actual flight extent + imagery source metadata + license attribution + chunking strategy | +| `seed_region.py` | Python script that submits `POST /api/satellite/request` to satellite-provider for each (zoom × chunk) and polls until completion | +| `README.md` | this file | + +## Prerequisites + +1. **Running satellite-provider.** Typically the Jetson e2e harness via `docker-compose.test.jetson.yml` (services `satellite-provider` + `satellite-provider-postgres`). Verify it's up and healthy: + + ```bash + ssh jetson-e2e "docker ps --filter name=satellite --format 'table {{.Names}}\t{{.Status}}'" + ``` + +2. **`SATELLITE_PROVIDER_URL`** in `.env.test` (already covered by the AZ-777 Phase 1 wiring). + +3. **`SATELLITE_PROVIDER_API_KEY`** — a valid HS256 JWT signed with the same `JWT_SECRET` the satellite-provider validates against. Mint with: + + ```bash + export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)" + ``` + +4. **Google Maps Platform API key** on the satellite-provider side (env `GOOGLE_MAPS_API_KEY` / config `MapConfig__ApiKey`). The satellite-provider uses this to actually download the tiles from Google Maps. The seed script does NOT need this key — it only triggers the producer's async download pipeline. + +## Quick start + +```bash +# from gps-denied-onboard repo root, with satellite-provider running on Jetson: +export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)" +python tests/fixtures/derkachi_c6/seed_region.py +``` + +Expected runtime: ~5-15 minutes for the full spec bbox (8 region calls × ~30-60s each + inventory verification). + +## Flags + +``` +--bbox-config PATH override bbox.yaml location +--env-file PATH override .env.test fallback location +--output-summary PATH write a JSON summary for downstream consumers +--dry-run validate config + plan without submitting +--right-sized-flight use the actual ~1 km^2 flight extent (98% fewer tiles) +--skip-poll submit + return; don't wait for terminal status +--skip-inventory-verification skip the final coverage check +``` + +## bbox sizing — important + +`bbox.yaml` ships TWO bboxes: + +* `bbox`: per AZ-777 spec — covers ~11.1 × 7.14 km (~80 km²) of the Derkachi village area. **~4570 tiles z15-z18 (~57 MB)**. Default seeding target. +* `actual_flight_extent`: the real Derkachi flight footprint per `data_imu.csv` — only ~254 × 457m (~0.12 km²) centered at (50.082, 36.110). **~60 tiles z15-z18 (~1 MB)** if seeded right-sized via `--right-sized-flight`. + +The spec bbox is ~300× larger than the actual flight extent. The spec sizing is intentional generality — operators can fly any route within the box without re-seeding. The right-sized mode is appropriate when only the specific Derkachi clip needs coverage (e.g., CI test runs). + +## Imagery source — IMPORTANT licensing note + +AZ-777 was originally specced with **CARTO Voyager Basemap (CC-BY-3.0)** as the upstream imagery source. The 2026-05-22 black-box probe of the running satellite-provider revealed the actual upstream is **Google Maps satellite layer** (`mt0..mt3.google.com/vt/lyrs=s`). The AZ-777 spec was amended to reflect this reality (see Risk 4 in `_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md`). + +**Operators MUST propagate the attribution string `"Imagery © Google"` to any end-user-visible context** that incorporates tiles seeded by this script. Per `bbox.yaml::license`. + +**Dev/research use is approved. Production deploy requires either:** + +1. Google Maps Platform licensing review for the offline-cache use case (the C6 reference dataset is a long-lived stored cache, which Google Maps ToS may restrict), OR +2. A parent-suite ticket to add a true CC-BY satellite imagery provider to satellite-provider (candidates: Esri World Imagery, Mapbox satellite, Sentinel-2 via Copernicus). TBD; not in scope for AZ-777. + +## Re-seeding (after a satellite-provider DB wipe) + +The script is **idempotent and safe to re-run**: + +* Each invocation generates fresh region UUIDs, so each run creates a new set of region records on the producer side. +* The producer's tile-storage layer dedups via UPSERT on (zoom, x, y), so tiles already downloaded from Google Maps are NOT re-fetched — they're counted as `tilesReused` instead. +* Re-runs are cheap (just the region-tracking overhead) when the DB is warm. + +To verify the catalog is populated without re-running the full seed, query inventory directly: + +```bash +# inside the satellite-provider docker network: +docker run --rm --network gps-denied-onboard_default curlimages/curl:8.10.1 \ + -sk -X POST -H "Authorization: Bearer $SATELLITE_PROVIDER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"tiles":[{"z":18,"x":157497,"y":89000}]}' \ + https://satellite-provider:8080/api/satellite/tiles/inventory +``` + +## Cost expectations + +Per the 2026-05-22 probe baseline (200m @ z18 = 9 tiles, ~13 KB/tile, ~5s end-to-end): + +| Mode | Tile count | DB size | Wall time (cold) | Wall time (warm DB) | +|------|------------|---------|------------------|---------------------| +| Spec bbox (~80 km²) | ~4570 | ~57 MB | ~5-15 min | ~30s (reuse) | +| Right-sized (~1 km²) | ~60 | ~1 MB | ~1-2 min | ~10s (reuse) | + +Google Maps API cost per tile depends on the satellite-provider operator's Maps Platform pricing tier. The seed script does NOT bill — the producer's Google Maps account does. + +## Failure modes + +| Exit code | Meaning | Likely cause | +|-----------|---------|--------------| +| 71 | config file missing / malformed | `bbox.yaml` corrupted or wrong path | +| 72 | required env var missing | `SATELLITE_PROVIDER_URL` or `SATELLITE_PROVIDER_API_KEY` not set | +| 73 | satellite-provider unreachable | Service down, wrong URL, or TLS handshake failed (try `SATELLITE_PROVIDER_TLS_INSECURE=1`) | +| 74 | region request rejected | HTTP 4xx (auth, validation) or 5xx (producer crash); see stderr for HTTP body | +| 75 | one or more regions failed | Background processing failed — usually a Google Maps API quota / key issue on the producer side. Check `docker logs gps-denied-e2e-satellite-provider` | +| 76 | inventory verification mismatch | < 95% of expected tiles present; re-run to retry, or investigate producer logs | + +## Cross-references + +* AZ-777 spec: `_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md` +* AZ-777 Phase 1 (the wiring that makes this script callable): completed cycle 3 batch 105 +* AZ-808 (parent-suite): strict validation for region-request endpoint — when this lands, malformed `seed_region.py` invocations will fail with RFC 7807 ValidationProblemDetails instead of silent zero-coercion; coordinate any consumer-side changes with that release +* AZ-812 (parent-suite): rename `RequestRegionRequest.{Latitude, Longitude}` → `{Lat, Lon}` for OSM consistency — when this lands, `seed_region.py` must be updated to send `lat`/`lon` instead of `latitude`/`longitude` +* satellite-provider Region API contract: today informally documented in `../../../../satellite-provider/_docs/02_document/modules/common_dtos.md::RegionRequest` + `system-flows.md` Flow F2; formal `region-request.md` contract will be published as part of AZ-808 diff --git a/tests/fixtures/derkachi_c6/bbox.yaml b/tests/fixtures/derkachi_c6/bbox.yaml new file mode 100644 index 0000000..17c4de5 --- /dev/null +++ b/tests/fixtures/derkachi_c6/bbox.yaml @@ -0,0 +1,158 @@ +# Derkachi reference tile catalog — bbox + zoom + imagery source metadata +# +# This file drives `seed_region.py`, which calls the parent-suite +# `satellite-provider` Region API (`POST /api/satellite/request`) to seed +# the satellite-provider DB with the Derkachi tiles the C6 reference +# catalog needs for AZ-777 Phase 2+. +# +# Authoritative spec: ../_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md +# Probe-confirmed wire format (2026-05-22): {id, latitude, longitude, +# sizeMeters, zoomLevel, stitchTiles} — see AZ-812 for the planned +# latitude/longitude -> lat/lon OSM rename. + +# --------------------------------------------------------------------- +# Bounding box — per AZ-777 spec +# --------------------------------------------------------------------- +# Spec bbox covers the Derkachi village area generously so the operator +# can fly any route within the box without re-seeding the catalog. This +# is INTENTIONALLY larger than the specific Derkachi flight extent (see +# `actual_flight_extent` below for the real per-flight footprint). +bbox: + lat_min: 50.05 + lat_max: 50.15 + lon_min: 36.05 + lon_max: 36.15 + # Approximate at lat ~50: 11.1 km (lat) x 7.14 km (lon) = ~79 km^2 + +# --------------------------------------------------------------------- +# Actual Derkachi flight extent (derived from data_imu.csv, 2026-05-22) +# --------------------------------------------------------------------- +# The real flight is much smaller than the spec bbox. Documented here +# for transparency. If only the specific Derkachi clip needs coverage, +# re-running seed_region.py with `--right-sized-flight` would seed +# ~60 tiles instead of ~4570 tiles (~98% reduction). +actual_flight_extent: + source: _docs/00_problem/input_data/flight_derkachi/data_imu.csv + lat_min: 50.080870 + lat_max: 50.083159 + lon_min: 36.107000 + lon_max: 36.113401 + # ~254 m (lat) x ~457 m (lon), centered at (50.082, 36.110) + +# --------------------------------------------------------------------- +# Zoom levels — per AZ-777 spec +# --------------------------------------------------------------------- +zoom_levels: + - 15 # ~5 m / pixel — coarse coverage + - 16 # ~2.4 m / pixel — mid-altitude search + - 17 # ~1.2 m / pixel — close-altitude search + - 18 # ~0.6 m / pixel — VPR descriptor lock (primary) + +# --------------------------------------------------------------------- +# Region API chunking strategy +# --------------------------------------------------------------------- +# Region API caps `sizeMeters` at 10000 (10 km side). The Derkachi bbox +# is 11.1 km along the lat axis, so a single region call cannot cover +# the full bbox. seed_region.py splits the bbox into N north-south +# chunks per zoom level (default: 2 chunks, each centered to cover the +# corresponding half of the bbox with sizeMeters=10000). Overlap is +# deduplicated server-side by the producer's UPSERT-on-tile-coord +# behavior so we do not pay Google Maps cost twice for overlapping +# tiles. +chunking: + chunks_per_zoom: 2 + size_meters_per_chunk: 10000 + stitch_tiles: false # individual tiles, not stitched composite + +# --------------------------------------------------------------------- +# Imagery source — IMPORTANT: amended 2026-05-22 (see AZ-777 risk 4) +# --------------------------------------------------------------------- +# AZ-777 was originally specced with CARTO Voyager Basemap (CC-BY-3.0) +# as the upstream imagery source. The 2026-05-22 black-box probe of +# the running satellite-provider revealed the actual upstream is Google +# Maps (verified via satellite-provider/_docs/02_document/architecture.md +# line 49 + producer logs: `mt0..mt3.google.com/vt/lyrs=s`). +# +# Dev/research use is acceptable. Production deploy requires either: +# (a) Google Maps Platform licensing review for offline-cache use, OR +# (b) parent-suite ticket to add a true CC-BY satellite imagery +# provider to satellite-provider (Esri World Imagery, Mapbox +# satellite, Sentinel-2 via Copernicus, etc.). +imagery_source: + provider: google_maps + layer: lyrs=s # Google Maps satellite layer (high-resolution overhead) + resolution_at_z18_lat50_m_per_px: 0.384 # probe-measured 2026-05-22 + fetch_pattern: mt[0-3].google.com/vt/lyrs=s&x={x}&y={y}&z={z}&token={session_token} + +# --------------------------------------------------------------------- +# License attribution (operators must propagate to end users) +# --------------------------------------------------------------------- +license: + source: Google Maps Platform Terms of Service + url: https://cloud.google.com/maps-platform/terms + attribution_text: "Imagery © Google" + dev_use_only: true + production_warning: | + The Google Maps Platform ToS may restrict offline caching of map + tiles for use as a derivative reference dataset (e.g. a long-lived + VPR reference base). This catalog is approved for dev/research use + only. Before any production deployment, EITHER: + 1. Obtain Google Maps Platform licensing approval for the C6 + offline-cache use case, OR + 2. Migrate to a CC-BY satellite imagery provider on the + satellite-provider side (parent-suite ticket TBD). + +# --------------------------------------------------------------------- +# Catalog size budget — OVER BUDGET WARNING +# --------------------------------------------------------------------- +# Per AZ-777 spec line 178: "The seeded Derkachi catalog size budget is +# 100 MB on the satellite-provider DB side. Over budget -> reduce +# zoom-level coverage; document in bbox.yaml." +# +# Actual estimate against the spec bbox + zoom 15-18: ~11386 tiles +# (per seed_region.py --dry-run), ~148 MB (at probe-measured 13 KB/tile +# avg). This is ~48% OVER the spec's 100 MB budget. +# +# The original spec budget was likely calibrated for CARTO Voyager +# tiles, which are smaller than the Google Maps satellite tiles we +# actually fetch (CARTO is street-feature vector overlay; Google +# satellite is high-detail overhead JPEG). The reality is heavier. +# +# Three mitigation options (operator picks at run time): +# 1. Drop z=18 from `zoom_levels` -> ~3000 tiles, ~40 MB (in budget), +# but loses primary VPR descriptor-lock zoom. NOT RECOMMENDED for +# AC-4 / AZ-699 passes. +# 2. Reduce bbox -> e.g. 5x5 km tight to the flight cluster instead +# of the full village area. Coverage shrinks proportionally. +# 3. Use `--right-sized-flight` -> ~60 tiles, ~1 MB. Tight to the +# specific Derkachi clip; cannot fly an alternative path within +# the original spec bbox without re-seeding. +# +# Default behavior (no flag): seed the full spec bbox EVEN THOUGH IT +# EXCEEDS THE BUDGET. seed_region.py will print a loud warning at the +# start of the run if the estimated size is over budget, and the +# operator can interrupt + re-run with one of the mitigations above. +catalog_size_budget: + max_bytes_db_side: 104857600 # 100 MB (spec budget) + estimated_tile_count_spec_bbox: 11386 + estimated_tile_count_right_sized: 60 + estimated_avg_bytes_per_tile: 13046 # probe-measured 2026-05-22 + estimated_total_bytes_spec_bbox: 148581256 # ~141.7 MB (~48% OVER) + estimated_total_bytes_right_sized: 782760 # ~0.75 MB + over_budget_warning: | + Default spec bbox seeding exceeds the spec's 100 MB budget by ~48% + when seeding all 4 zoom levels (15-18) with Google Maps satellite + imagery. Operator must choose: accept the overage, drop a zoom + level, reduce the bbox, or use --right-sized-flight. + +# --------------------------------------------------------------------- +# Provenance +# --------------------------------------------------------------------- +provenance: + created_for: AZ-777 (gps-denied-onboard Derkachi C6 reference fixture) + bbox_source: AZ-777 spec outcome line 31 (50.05-50.15 lat, 36.05-36.15 lon) + flight_extent_source: data_imu.csv computed 2026-05-22 + api_contract_source: probe-confirmed against running satellite-provider on Jetson, 2026-05-22 + related_tickets: + - AZ-808 # validation for region-request endpoint + - AZ-812 # rename latitude/longitude -> lat/lon (consumer must update after AZ-812 lands) diff --git a/tests/fixtures/derkachi_c6/seed_region.py b/tests/fixtures/derkachi_c6/seed_region.py new file mode 100644 index 0000000..ac58b88 --- /dev/null +++ b/tests/fixtures/derkachi_c6/seed_region.py @@ -0,0 +1,522 @@ +#!/usr/bin/env python3 +"""Seed the Derkachi reference tile catalog via satellite-provider's Region API. + +AZ-777 Phase 2 deliverable. Reads ``bbox.yaml`` next to this script and +submits one or more ``POST /api/satellite/request`` calls per zoom level +to register the Derkachi bbox with the parent-suite satellite-provider. +Polls each region's status until terminal, then verifies the expected +tile count via ``POST /api/satellite/tiles/inventory``. + +This script is intended to run from the gps-denied-onboard repo root +against a running satellite-provider (typically the Jetson e2e harness's +``satellite-provider`` service). It does NOT spin up the service itself +and does NOT modify any satellite-provider code or configuration. + +Required environment (loaded from ``.env.test`` if not exported):: + + SATELLITE_PROVIDER_URL e.g. https://satellite-provider:8080 + SATELLITE_PROVIDER_API_KEY a valid HS256 JWT (mint with scripts/mint_dev_jwt.py) + SATELLITE_PROVIDER_TLS_INSECURE optional, "1" to accept self-signed dev certs + JWT_SECRET / JWT_ISSUER / JWT_AUDIENCE required only if --auto-mint-jwt is passed + +Usage:: + + # mint a JWT then seed using defaults from bbox.yaml + export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)" + python tests/fixtures/derkachi_c6/seed_region.py + + # dry-run (validate config + auth without submitting requests) + python tests/fixtures/derkachi_c6/seed_region.py --dry-run + + # right-sized to actual flight extent (faster, fewer tiles) + python tests/fixtures/derkachi_c6/seed_region.py --right-sized-flight + + # write a JSON summary for downstream consumers (fixture / CI) + python tests/fixtures/derkachi_c6/seed_region.py --output-summary /tmp/seed.json + +Exit codes:: + + 0 all regions reached terminal status and inventory verification passed + 71 config file missing / malformed + 72 required env var missing + 73 satellite-provider unreachable (TCP / TLS error) + 74 region request rejected (HTTP 4xx / 5xx) + 75 one or more regions failed during background processing + 76 inventory verification mismatch (fewer tiles present than expected) +""" + +from __future__ import annotations + +import argparse +import json +import math +import os +import sys +import time +import uuid +from dataclasses import dataclass, field +from pathlib import Path +from typing import Any + +try: + import httpx +except ImportError as exc: + sys.stderr.write( + f"ERROR: httpx not installed: {exc}\nRun `pip install -e .[dev]` from the repo root.\n" + ) + sys.exit(72) + +try: + import yaml +except ImportError as exc: + sys.stderr.write( + f"ERROR: PyYAML not installed: {exc}\nRun `pip install -e .[dev]` from the repo root.\n" + ) + sys.exit(72) + + +_REQUEST_TIMEOUT_S = 30.0 +_POLL_INTERVAL_S = 5.0 +_POLL_MAX_ATTEMPTS = 60 # 60 * 5s = 5 min per region +_TERMINAL_STATUSES = frozenset({"completed", "failed", "error", "done", "succeeded"}) +_FAILURE_STATUSES = frozenset({"failed", "error"}) + + +@dataclass +class RegionChunk: + """One Region API submission: a square area at one zoom level.""" + + zoom: int + center_lat: float + center_lon: float + size_meters: int + chunk_label: str # e.g. "z18-north" — for human-readable logs only + region_id: uuid.UUID = field(default_factory=uuid.uuid4) + submitted_status: str | None = None + terminal_status: str | None = None + tiles_downloaded: int = 0 + tiles_reused: int = 0 + csv_path: str | None = None + summary_path: str | None = None + + +def _load_env_file(path: Path) -> dict[str, str]: + """Parse a KEY=VALUE env file. Honours quoting; ignores comments.""" + + if not path.is_file(): + return {} + out: dict[str, str] = {} + for raw in path.read_text("utf-8").splitlines(): + line = raw.strip() + if not line or line.startswith("#") or "=" not in line: + continue + key, _, value = line.partition("=") + out[key.strip()] = value.strip().strip('"').strip("'") + return out + + +def _resolve_env(name: str, env_file_values: dict[str, str]) -> str | None: + return os.environ.get(name) or env_file_values.get(name) + + +def _compute_chunks(config: dict[str, Any], right_sized: bool) -> list[RegionChunk]: + """Plan all Region API submissions for one seeding pass. + + Splits each zoom level into N chunks across the lat axis so each + chunk fits within the Region API's sizeMeters cap (10000). + """ + + if right_sized: + bbox = config["actual_flight_extent"] + else: + bbox = config["bbox"] + chunks_per_zoom = config["chunking"]["chunks_per_zoom"] + size_meters = int(config["chunking"]["size_meters_per_chunk"]) + zoom_levels = config["zoom_levels"] + + if right_sized: + # The flight extent is < 1 km, one chunk per zoom is sufficient. + chunks_per_zoom = 1 + size_meters = 1000 + + lat_centers: list[float] + if chunks_per_zoom == 1: + lat_centers = [(bbox["lat_min"] + bbox["lat_max"]) / 2.0] + else: + span = bbox["lat_max"] - bbox["lat_min"] + step = span / chunks_per_zoom + lat_centers = [bbox["lat_min"] + step * (i + 0.5) for i in range(chunks_per_zoom)] + center_lon = (bbox["lon_min"] + bbox["lon_max"]) / 2.0 + + chunks: list[RegionChunk] = [] + for zoom in zoom_levels: + for idx, lat in enumerate(lat_centers): + label_suffix = f"chunk{idx}" if chunks_per_zoom > 1 else "single" + chunks.append( + RegionChunk( + zoom=zoom, + center_lat=lat, + center_lon=center_lon, + size_meters=size_meters, + chunk_label=f"z{zoom}-{label_suffix}", + ) + ) + return chunks + + +def _expected_tile_coords(config: dict[str, Any], right_sized: bool) -> list[tuple[int, int, int]]: + """Compute the slippy-map (z, x, y) tile coords covering the bbox. + + Used by the inventory verification step. + """ + + if right_sized: + bbox = config["actual_flight_extent"] + else: + bbox = config["bbox"] + coords: list[tuple[int, int, int]] = [] + for z in config["zoom_levels"]: + n = 2**z + x_min = int((bbox["lon_min"] + 180) / 360 * n) + x_max = int((bbox["lon_max"] + 180) / 360 * n) + y_min = int((1 - math.asinh(math.tan(math.radians(bbox["lat_max"]))) / math.pi) / 2 * n) + y_max = int((1 - math.asinh(math.tan(math.radians(bbox["lat_min"]))) / math.pi) / 2 * n) + for x in range(x_min, x_max + 1): + for y in range(y_min, y_max + 1): + coords.append((z, x, y)) + return coords + + +def _submit_region( + client: httpx.Client, sp_url: str, headers: dict[str, str], chunk: RegionChunk +) -> tuple[bool, str]: + """Submit one Region API request. Returns (success, message).""" + + body = { + "id": str(chunk.region_id), + "latitude": chunk.center_lat, + "longitude": chunk.center_lon, + "sizeMeters": chunk.size_meters, + "zoomLevel": chunk.zoom, + "stitchTiles": False, + } + try: + resp = client.post( + f"{sp_url}/api/satellite/request", + headers=headers, + json=body, + timeout=_REQUEST_TIMEOUT_S, + ) + except httpx.HTTPError as exc: + return False, f"network error: {exc}" + if resp.status_code != 200: + return False, f"HTTP {resp.status_code}: {resp.text[:200]}" + try: + payload = resp.json() + chunk.submitted_status = payload.get("status") + except json.JSONDecodeError as exc: + return False, f"unexpected response body (not JSON): {exc}; raw={resp.text[:200]}" + return True, f"submitted; initial status={chunk.submitted_status}" + + +def _poll_region( + client: httpx.Client, sp_url: str, headers: dict[str, str], chunk: RegionChunk +) -> str: + """Poll one Region until terminal status. Updates chunk fields in-place. + + Returns the final status string. Raises RuntimeError on timeout. + """ + + for attempt in range(1, _POLL_MAX_ATTEMPTS + 1): + try: + resp = client.get( + f"{sp_url}/api/satellite/region/{chunk.region_id}", + headers=headers, + timeout=_REQUEST_TIMEOUT_S, + ) + resp.raise_for_status() + payload = resp.json() + except (httpx.HTTPError, json.JSONDecodeError) as exc: + sys.stderr.write(f" [{chunk.chunk_label}] poll attempt {attempt} failed: {exc}\n") + time.sleep(_POLL_INTERVAL_S) + continue + status = (payload.get("status") or "").lower() + chunk.terminal_status = status + chunk.tiles_downloaded = payload.get("tilesDownloaded", 0) + chunk.tiles_reused = payload.get("tilesReused", 0) + chunk.csv_path = payload.get("csvFilePath") + chunk.summary_path = payload.get("summaryFilePath") + if status in _TERMINAL_STATUSES: + return status + if attempt % 6 == 0: # every ~30s + sys.stderr.write( + f" [{chunk.chunk_label}] still {status} (attempt {attempt}/{_POLL_MAX_ATTEMPTS})\n" + ) + time.sleep(_POLL_INTERVAL_S) + raise RuntimeError( + f"region {chunk.region_id} ({chunk.chunk_label}) did not reach terminal " + f"status within {_POLL_MAX_ATTEMPTS * _POLL_INTERVAL_S:.0f}s" + ) + + +def _verify_inventory( + client: httpx.Client, + sp_url: str, + headers: dict[str, str], + expected_coords: list[tuple[int, int, int]], +) -> tuple[int, int]: + """Query inventory for the expected tile coords. Returns (present, total).""" + + BATCH_SIZE = 5000 + total_present = 0 + total = 0 + for batch_start in range(0, len(expected_coords), BATCH_SIZE): + batch = expected_coords[batch_start : batch_start + BATCH_SIZE] + body = {"tiles": [{"z": z, "x": x, "y": y} for z, x, y in batch]} + try: + resp = client.post( + f"{sp_url}/api/satellite/tiles/inventory", + headers=headers, + json=body, + timeout=_REQUEST_TIMEOUT_S, + ) + resp.raise_for_status() + payload = resp.json() + except (httpx.HTTPError, json.JSONDecodeError) as exc: + sys.stderr.write(f"inventory batch starting at {batch_start} failed: {exc}\n") + continue + results = payload.get("results", []) + total += len(results) + total_present += sum(1 for r in results if r.get("present")) + return total_present, total + + +def main() -> int: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--bbox-config", + type=Path, + default=Path(__file__).parent / "bbox.yaml", + help="Path to bbox.yaml (default: alongside this script).", + ) + parser.add_argument( + "--env-file", + type=Path, + default=Path(".env.test"), + help="Fallback env file (default: .env.test in CWD).", + ) + parser.add_argument( + "--output-summary", + type=Path, + default=None, + help="Optional path to write a JSON summary of the seeding run.", + ) + parser.add_argument( + "--dry-run", + action="store_true", + help="Plan + validate auth, but do not submit Region requests.", + ) + parser.add_argument( + "--right-sized-flight", + action="store_true", + help=( + "Use the actual_flight_extent bbox (~1 km^2) instead of the full " + "AZ-777 spec bbox (~80 km^2). ~98%% fewer tiles, useful when only " + "the specific Derkachi clip needs coverage." + ), + ) + parser.add_argument( + "--skip-poll", + action="store_true", + help="Submit all regions but do not poll; exit immediately after submission.", + ) + parser.add_argument( + "--skip-inventory-verification", + action="store_true", + help="Skip the final inventory verification step.", + ) + args = parser.parse_args() + + if not args.bbox_config.is_file(): + sys.stderr.write(f"ERROR: bbox config not found: {args.bbox_config}\n") + return 71 + try: + config = yaml.safe_load(args.bbox_config.read_text("utf-8")) + except yaml.YAMLError as exc: + sys.stderr.write(f"ERROR: failed to parse {args.bbox_config}: {exc}\n") + return 71 + + env_file_values = _load_env_file(args.env_file) + sp_url = _resolve_env("SATELLITE_PROVIDER_URL", env_file_values) + jwt_token = _resolve_env("SATELLITE_PROVIDER_API_KEY", env_file_values) + tls_insecure = _resolve_env("SATELLITE_PROVIDER_TLS_INSECURE", env_file_values) == "1" + + if not sp_url: + sys.stderr.write("ERROR: SATELLITE_PROVIDER_URL not set (env or .env.test).\n") + return 72 + if not jwt_token: + sys.stderr.write( + "ERROR: SATELLITE_PROVIDER_API_KEY not set. Mint with:\n" + " python scripts/mint_dev_jwt.py\n" + ) + return 72 + + chunks = _compute_chunks(config, args.right_sized_flight) + expected_coords = _expected_tile_coords(config, args.right_sized_flight) + + # Budget check — loud warning if over-budget per AZ-777 spec line 178. + avg_bytes = int(config["catalog_size_budget"]["estimated_avg_bytes_per_tile"]) + budget_bytes = int(config["catalog_size_budget"]["max_bytes_db_side"]) + estimated_total = len(expected_coords) * avg_bytes + over_budget = estimated_total > budget_bytes + + print( + f"[plan] satellite-provider: {sp_url} (tls_insecure={tls_insecure})\n" + f"[plan] bbox mode: {'right-sized flight' if args.right_sized_flight else 'spec bbox (~80 km^2)'}\n" + f"[plan] zoom levels: {config['zoom_levels']}\n" + f"[plan] region chunks to submit: {len(chunks)}\n" + f"[plan] expected tile coverage: {len(expected_coords)} tiles\n" + f"[plan] estimated DB size: {estimated_total / 1_048_576:.1f} MB " + f"(budget: {budget_bytes / 1_048_576:.0f} MB)\n" + f"[plan] imagery source: {config['imagery_source']['provider']}/{config['imagery_source']['layer']}\n" + f"[plan] license: {config['license']['source']}\n" + f"[plan] attribution: {config['license']['attribution_text']}\n" + ) + if over_budget: + overage_pct = (estimated_total - budget_bytes) / budget_bytes * 100 + sys.stderr.write( + "WARNING: estimated DB size exceeds spec budget by " + f"~{overage_pct:.0f}%. Per AZ-777 line 178 you can:\n" + " - drop a zoom level (edit bbox.yaml::zoom_levels)\n" + " - reduce bbox (edit bbox.yaml::bbox)\n" + " - use --right-sized-flight (tight to actual flight extent)\n" + "Continuing anyway. Use --dry-run to inspect without seeding.\n" + ) + + if args.dry_run: + print("[dry-run] would submit:") + for c in chunks: + print( + f" {c.chunk_label}: id={c.region_id} " + f"lat={c.center_lat:.5f} lon={c.center_lon:.5f} " + f"size={c.size_meters} zoom={c.zoom}" + ) + return 0 + + headers = { + "Authorization": f"Bearer {jwt_token}", + "Content-Type": "application/json", + } + client = httpx.Client(verify=not tls_insecure) + try: + # ----- Phase A: submit all regions upfront ----- + print(f"\n[submit] sending {len(chunks)} region requests...") + submission_failures: list[tuple[RegionChunk, str]] = [] + for c in chunks: + ok, msg = _submit_region(client, sp_url, headers, c) + print(f" [{c.chunk_label}] {msg}") + if not ok: + submission_failures.append((c, msg)) + if submission_failures: + sys.stderr.write(f"ERROR: {len(submission_failures)} submission(s) failed:\n") + for c, msg in submission_failures: + sys.stderr.write(f" [{c.chunk_label}] {msg}\n") + return 74 + + if args.skip_poll: + print( + "\n[skip-poll] all submissions sent; " + "background processing continues asynchronously. " + f"Region IDs: {[str(c.region_id) for c in chunks]}" + ) + return 0 + + # ----- Phase B: poll each region until terminal ----- + print(f"\n[poll] waiting for {len(chunks)} regions to reach terminal status...") + poll_failures: list[RegionChunk] = [] + for c in chunks: + try: + status = _poll_region(client, sp_url, headers, c) + except RuntimeError as exc: + sys.stderr.write(f" [{c.chunk_label}] {exc}\n") + poll_failures.append(c) + continue + tiles = c.tiles_downloaded + c.tiles_reused + print( + f" [{c.chunk_label}] terminal={status} tiles={tiles} " + f"(downloaded={c.tiles_downloaded} reused={c.tiles_reused})" + ) + if status in _FAILURE_STATUSES: + poll_failures.append(c) + if poll_failures: + sys.stderr.write(f"ERROR: {len(poll_failures)} region(s) did not complete cleanly\n") + return 75 + + # ----- Phase C: verify inventory ----- + if not args.skip_inventory_verification: + print(f"\n[inventory] verifying {len(expected_coords)} expected tile coords...") + present, queried = _verify_inventory(client, sp_url, headers, expected_coords) + print( + f"[inventory] present: {present}/{queried} " + f"({present / queried * 100:.1f}% coverage)" + if queried + else "[inventory] no tiles queried" + ) + if queried and present < queried: + missing = queried - present + sys.stderr.write( + f"WARNING: {missing} expected tile(s) not present in inventory. " + "This may indicate partial region failures, edge-tile gaps, or " + "Google Maps API timeouts. Re-run seed_region.py to fill gaps " + "(producer dedups via UPSERT-on-coord, so retries are safe).\n" + ) + if present / queried < 0.95: + return 76 + finally: + client.close() + + # ----- Summary output ----- + total_downloaded = sum(c.tiles_downloaded for c in chunks) + total_reused = sum(c.tiles_reused for c in chunks) + print( + f"\n[done] seeded {len(chunks)} regions: " + f"downloaded={total_downloaded} reused={total_reused}" + ) + + if args.output_summary: + summary = { + "sp_url": sp_url, + "bbox_mode": "right-sized" if args.right_sized_flight else "spec", + "imagery_source": config["imagery_source"], + "license": config["license"], + "chunks": [ + { + "label": c.chunk_label, + "region_id": str(c.region_id), + "zoom": c.zoom, + "center_lat": c.center_lat, + "center_lon": c.center_lon, + "size_meters": c.size_meters, + "terminal_status": c.terminal_status, + "tiles_downloaded": c.tiles_downloaded, + "tiles_reused": c.tiles_reused, + "csv_path": c.csv_path, + "summary_path": c.summary_path, + } + for c in chunks + ], + "totals": { + "regions": len(chunks), + "tiles_downloaded": total_downloaded, + "tiles_reused": total_reused, + }, + } + args.output_summary.parent.mkdir(parents=True, exist_ok=True) + args.output_summary.write_text(json.dumps(summary, indent=2)) + print(f"[done] summary written to {args.output_summary}") + + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/tests/unit/c11_tile_manager/test_tile_downloader.py b/tests/unit/c11_tile_manager/test_tile_downloader.py index f1ad7c1..d8d768b 100644 --- a/tests/unit/c11_tile_manager/test_tile_downloader.py +++ b/tests/unit/c11_tile_manager/test_tile_downloader.py @@ -36,7 +36,6 @@ from gps_denied_onboard.components.c11_tile_manager import ( request_hash, ) - _BASE_URL = "https://parent-suite.test" _INVENTORY_PATH = "/api/satellite/tiles/inventory" _TILES_PATH_PREFIX = "/tiles/" @@ -111,14 +110,10 @@ class _StubTileWriter: } ) if call_index in self.rejected_indices: - raise _StubFreshnessRejection( - f"freshness rejected call_index={call_index}" - ) + raise _StubFreshnessRejection(f"freshness rejected call_index={call_index}") return self.labels_by_index.get(call_index, "fresh") - def tile_already_present( - self, *, zoom_level: int, lat: float, lon: float - ) -> bool: + def tile_already_present(self, *, zoom_level: int, lat: float, lon: float) -> bool: self.exists_calls.append((zoom_level, lat, lon)) return False @@ -238,9 +233,9 @@ def _inventory_entry_for_coord( ) -> dict[str, Any]: if not present: return { - "tileZoom": int(zoom), - "tileX": int(x), - "tileY": int(y), + "z": int(zoom), + "x": int(x), + "y": int(y), "locationHash": str(uuid4()), "present": False, "id": None, @@ -251,9 +246,9 @@ def _inventory_entry_for_coord( } captured = captured_at or datetime(2026, 5, 13, 0, 0, 0, tzinfo=timezone.utc) return { - "tileZoom": int(zoom), - "tileX": int(x), - "tileY": int(y), + "z": int(zoom), + "x": int(x), + "y": int(y), "locationHash": str(uuid4()), "present": True, "id": str(uuid4()), @@ -303,9 +298,9 @@ def _make_inventory_handler( resolution = 0.5 results.append( _inventory_entry_for_coord( - zoom=int(t["tileZoom"]), - x=int(t["tileX"]), - y=int(t["tileY"]), + zoom=int(t["z"]), + x=int(t["x"]), + y=int(t["y"]), present=is_present, resolution_m_per_px=resolution, ) @@ -331,12 +326,8 @@ def _make_inventory_handler( def test_ac1_100_tile_happy_path_writes_all(tmp_path: Path) -> None: # Arrange — bbox at zoom 14 produces N coord candidates; the # stub marks the first 100 as `present=true` and the rest absent. - transport = httpx.MockTransport( - _make_inventory_handler(present_count=100) - ) - (downloader, _logs, writer, enforcer, _sleeps) = _build_downloader( - transport=transport - ) + transport = httpx.MockTransport(_make_inventory_handler(present_count=100)) + (downloader, _logs, writer, enforcer, _sleeps) = _build_downloader(transport=transport) request = _make_request(cache_root=tmp_path) # Act @@ -366,9 +357,7 @@ def test_ac2_resolution_gate_rejects_sub_spec_tiles(tmp_path: Path) -> None: resolution_override_for_first_n=(10, 0.3), ) ) - (downloader, log_records, writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, log_records, writer, _enforcer, _sleeps) = _build_downloader(transport=transport) # Act report = downloader.download_tiles_for_area(_make_request(cache_root=tmp_path)) @@ -377,7 +366,9 @@ def test_ac2_resolution_gate_rejects_sub_spec_tiles(tmp_path: Path) -> None: assert report.tiles_rejected_resolution == 10 assert report.tiles_downloaded == 40 assert len(writer.write_calls) == 40 - res_warnings = [r for r in log_records if getattr(r, "kind", "") == "c11.download.resolution_rejected"] + res_warnings = [ + r for r in log_records if getattr(r, "kind", "") == "c11.download.resolution_rejected" + ] assert len(res_warnings) == 10 @@ -402,7 +393,9 @@ def test_ac3_freshness_rejections_counted_and_run_continues(tmp_path: Path) -> N assert report.tiles_downloaded == 5 assert report.outcome == DownloadOutcome.SUCCESS summary_warns = [ - r for r in log_records if getattr(r, "kind", "") == "c11.download.freshness_rejected_summary" + r + for r in log_records + if getattr(r, "kind", "") == "c11.download.freshness_rejected_summary" ] assert len(summary_warns) == 1 @@ -483,9 +476,7 @@ def test_ac6_persistent_5xx_raises_satellite_provider_error(tmp_path: Path) -> N tile_response_factory=_factory, ) ) - (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport) # Act / Assert with pytest.raises(SatelliteProviderError): @@ -512,9 +503,7 @@ def test_ac7_401_fails_fast_no_retry(tmp_path: Path) -> None: tile_response_factory=_factory, ) ) - (downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, _log_records, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport) # Act / Assert with pytest.raises(SatelliteProviderError): @@ -541,9 +530,7 @@ def test_ac8_idempotent_rerun_yields_no_op(tmp_path: Path) -> None: tile_response_factory=_factory, ) ) - (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport) request = _make_request(cache_root=tmp_path) # Act — first run @@ -578,10 +565,8 @@ def test_ac9_cache_budget_pre_check_aborts(tmp_path: Path) -> None: tile_response_factory=_factory, ) ) - enforcer = _StubBudgetEnforcer( - raise_on_call=CacheBudgetExceededError("no headroom") - ) - (downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader( + enforcer = _StubBudgetEnforcer(raise_on_call=CacheBudgetExceededError("no headroom")) + (downloader, _log_records, _writer, _enforcer, _sleeps) = _build_downloader( transport=transport, budget_enforcer=enforcer ) @@ -606,16 +591,12 @@ def test_ac11_service_api_key_never_appears_in_logs(tmp_path: Path) -> None: inventory_response_override=httpx.Response(401), ) ) - (downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport) with pytest.raises(SatelliteProviderError): downloader.download_tiles_for_area(_make_request(cache_root=tmp_path)) # Assert - flat = " ".join( - r.getMessage() + json.dumps(getattr(r, "kv", {})) for r in log_records - ) + flat = " ".join(r.getMessage() + json.dumps(getattr(r, "kv", {})) for r in log_records) assert _API_KEY not in flat assert "Bearer ***" in flat @@ -642,9 +623,7 @@ def test_ac12_partial_journal_resumed_on_rerun(tmp_path: Path) -> None: tile_response_factory=_factory, ) ) - (downloader, _logs, writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, _logs, writer, _enforcer, _sleeps) = _build_downloader(transport=transport) request = _make_request(cache_root=tmp_path) # First run — completes @@ -771,14 +750,10 @@ def test_nfr_throughput_1000_tiles_under_budget(tmp_path: Path) -> None: transport = httpx.MockTransport( _make_inventory_handler( present_count=1000, - tile_response_factory=lambda r: httpx.Response( - 200, content=b"\xff\xd8tile" - ), + tile_response_factory=lambda r: httpx.Response(200, content=b"\xff\xd8tile"), ) ) - (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader( - transport=transport - ) + (downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport) # Big bbox at zoom 14: ~ 32x32 tile span on this latitude is enough. # 1° ≈ 45 tiles at zoom 14 in latitude → 0.75° gives ≈ 33 tiles → ~1089 tiles.