[AZ-777] Phase 1 hotfix (z/x/y) + Phase 2 Derkachi seed + ops

Phase 1 hotfix:
- C11 HttpTileDownloader adapted to satellite-provider v2.0.0
  z/x/y inventory contract (bulk POST keyed by slippy-map coords).
- Unit tests rewritten to exercise the new inventory schema.
- E2E smoke test updated to match the v2.0.0 wire.

Phase 2 (Derkachi seed + smoke-validated on Jetson):
- tests/fixtures/derkachi_c6/{README,bbox.yaml,seed_region.py}
  drives POST /api/satellite/region against satellite-provider
  with Google Maps as the imagery source. Smoke run produced
  4 regions, 175 tiles, inventory 32/32.
- scripts/mint_dev_jwt.py + run-tests-jetson.sh auto-mint and
  export SATELLITE_PROVIDER_API_KEY using JWT_SECRET / JWT_ISSUER
  / JWT_AUDIENCE env vars (no host port mappings; e2e-runner
  reaches SP via internal docker network only).

Spec amendment: AZ-777 todo spec updated to record the
Google Maps imagery source decision and STOP-gate state.

AZ-777 Phase 3+ work is superseded by Epic AZ-835 (see next
commit).

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-22 17:39:21 +03:00
parent 811b04e605
commit b15454b9a9
9 changed files with 924 additions and 182 deletions
+10 -18
View File
@@ -49,7 +49,6 @@ from gps_denied_onboard.components.c11_tile_manager import (
)
from gps_denied_onboard.helpers.wgs_converter import WgsConverter
# ----------------------------------------------------------------------
# Skip gates
# ----------------------------------------------------------------------
@@ -113,10 +112,7 @@ def _mint_bearer_token_or_skip() -> str:
if not value
]
if missing:
pytest.skip(
"Cannot mint a fallback JWT — missing env: "
+ ", ".join(missing)
)
pytest.skip("Cannot mint a fallback JWT — missing env: " + ", ".join(missing))
now = datetime.now(timezone.utc)
payload = {
"sub": "gps-denied-onboard-smoke",
@@ -174,9 +170,9 @@ def test_smoke_satellite_provider_inventory_contract() -> None:
body = {
"tiles": [
{
"tileZoom": _DERKACHI_TILE_ZOOM,
"tileX": tile_x,
"tileY": tile_y,
"z": _DERKACHI_TILE_ZOOM,
"x": tile_x,
"y": tile_y,
}
]
}
@@ -191,8 +187,7 @@ def test_smoke_satellite_provider_inventory_contract() -> None:
# Assert — contract invariants from `tile-inventory.md` v1.0.0
assert response.status_code == 200, (
f"satellite-provider inventory POST returned "
f"{response.status_code}: {response.text!r}"
f"satellite-provider inventory POST returned {response.status_code}: {response.text!r}"
)
decoded = response.json()
assert isinstance(decoded, dict), f"expected JSON object, got {type(decoded)}"
@@ -201,13 +196,12 @@ def test_smoke_satellite_provider_inventory_contract() -> None:
assert isinstance(results, list)
assert len(results) == len(body["tiles"]), (
# Inv-2: response order/length matches request order/length.
f"inventory response length {len(results)} != request length "
f"{len(body['tiles'])}"
f"inventory response length {len(results)} != request length {len(body['tiles'])}"
)
entry = results[0]
assert entry["tileZoom"] == _DERKACHI_TILE_ZOOM
assert entry["tileX"] == tile_x
assert entry["tileY"] == tile_y
assert entry["z"] == _DERKACHI_TILE_ZOOM
assert entry["x"] == tile_x
assert entry["y"] == tile_y
assert "present" in entry, f"missing 'present' in entry {entry!r}"
assert isinstance(entry["present"], bool)
if entry["present"]:
@@ -259,9 +253,7 @@ class _InMemoryC6Adapter:
)
return "fresh"
def tile_already_present(
self, *, zoom_level: int, lat: float, lon: float
) -> bool:
def tile_already_present(self, *, zoom_level: int, lat: float, lon: float) -> bool:
self.exists_calls.append((zoom_level, lat, lon))
return False
+120
View File
@@ -0,0 +1,120 @@
# Derkachi reference C6 tile catalog — fixture seeding
**AZ-777 Phase 2 deliverable.** Seeds the parent-suite `satellite-provider` DB with the satellite tiles the C6 reference catalog needs for the Derkachi replay tests (`test_ac3_within_100m_80pct_of_ticks` on AC-4 + `test_az699_real_flight_validation_emits_verdict_and_report` on AC-5).
## What this folder contains
| File | Purpose |
|------|---------|
| `bbox.yaml` | bbox + zoom levels + actual flight extent + imagery source metadata + license attribution + chunking strategy |
| `seed_region.py` | Python script that submits `POST /api/satellite/request` to satellite-provider for each (zoom × chunk) and polls until completion |
| `README.md` | this file |
## Prerequisites
1. **Running satellite-provider.** Typically the Jetson e2e harness via `docker-compose.test.jetson.yml` (services `satellite-provider` + `satellite-provider-postgres`). Verify it's up and healthy:
```bash
ssh jetson-e2e "docker ps --filter name=satellite --format 'table {{.Names}}\t{{.Status}}'"
```
2. **`SATELLITE_PROVIDER_URL`** in `.env.test` (already covered by the AZ-777 Phase 1 wiring).
3. **`SATELLITE_PROVIDER_API_KEY`** — a valid HS256 JWT signed with the same `JWT_SECRET` the satellite-provider validates against. Mint with:
```bash
export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)"
```
4. **Google Maps Platform API key** on the satellite-provider side (env `GOOGLE_MAPS_API_KEY` / config `MapConfig__ApiKey`). The satellite-provider uses this to actually download the tiles from Google Maps. The seed script does NOT need this key — it only triggers the producer's async download pipeline.
## Quick start
```bash
# from gps-denied-onboard repo root, with satellite-provider running on Jetson:
export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)"
python tests/fixtures/derkachi_c6/seed_region.py
```
Expected runtime: ~5-15 minutes for the full spec bbox (8 region calls × ~30-60s each + inventory verification).
## Flags
```
--bbox-config PATH override bbox.yaml location
--env-file PATH override .env.test fallback location
--output-summary PATH write a JSON summary for downstream consumers
--dry-run validate config + plan without submitting
--right-sized-flight use the actual ~1 km^2 flight extent (98% fewer tiles)
--skip-poll submit + return; don't wait for terminal status
--skip-inventory-verification skip the final coverage check
```
## bbox sizing — important
`bbox.yaml` ships TWO bboxes:
* `bbox`: per AZ-777 spec — covers ~11.1 × 7.14 km (~80 km²) of the Derkachi village area. **~4570 tiles z15-z18 (~57 MB)**. Default seeding target.
* `actual_flight_extent`: the real Derkachi flight footprint per `data_imu.csv` — only ~254 × 457m (~0.12 km²) centered at (50.082, 36.110). **~60 tiles z15-z18 (~1 MB)** if seeded right-sized via `--right-sized-flight`.
The spec bbox is ~300× larger than the actual flight extent. The spec sizing is intentional generality — operators can fly any route within the box without re-seeding. The right-sized mode is appropriate when only the specific Derkachi clip needs coverage (e.g., CI test runs).
## Imagery source — IMPORTANT licensing note
AZ-777 was originally specced with **CARTO Voyager Basemap (CC-BY-3.0)** as the upstream imagery source. The 2026-05-22 black-box probe of the running satellite-provider revealed the actual upstream is **Google Maps satellite layer** (`mt0..mt3.google.com/vt/lyrs=s`). The AZ-777 spec was amended to reflect this reality (see Risk 4 in `_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md`).
**Operators MUST propagate the attribution string `"Imagery © Google"` to any end-user-visible context** that incorporates tiles seeded by this script. Per `bbox.yaml::license`.
**Dev/research use is approved. Production deploy requires either:**
1. Google Maps Platform licensing review for the offline-cache use case (the C6 reference dataset is a long-lived stored cache, which Google Maps ToS may restrict), OR
2. A parent-suite ticket to add a true CC-BY satellite imagery provider to satellite-provider (candidates: Esri World Imagery, Mapbox satellite, Sentinel-2 via Copernicus). TBD; not in scope for AZ-777.
## Re-seeding (after a satellite-provider DB wipe)
The script is **idempotent and safe to re-run**:
* Each invocation generates fresh region UUIDs, so each run creates a new set of region records on the producer side.
* The producer's tile-storage layer dedups via UPSERT on (zoom, x, y), so tiles already downloaded from Google Maps are NOT re-fetched — they're counted as `tilesReused` instead.
* Re-runs are cheap (just the region-tracking overhead) when the DB is warm.
To verify the catalog is populated without re-running the full seed, query inventory directly:
```bash
# inside the satellite-provider docker network:
docker run --rm --network gps-denied-onboard_default curlimages/curl:8.10.1 \
-sk -X POST -H "Authorization: Bearer $SATELLITE_PROVIDER_API_KEY" \
-H "Content-Type: application/json" \
-d '{"tiles":[{"z":18,"x":157497,"y":89000}]}' \
https://satellite-provider:8080/api/satellite/tiles/inventory
```
## Cost expectations
Per the 2026-05-22 probe baseline (200m @ z18 = 9 tiles, ~13 KB/tile, ~5s end-to-end):
| Mode | Tile count | DB size | Wall time (cold) | Wall time (warm DB) |
|------|------------|---------|------------------|---------------------|
| Spec bbox (~80 km²) | ~4570 | ~57 MB | ~5-15 min | ~30s (reuse) |
| Right-sized (~1 km²) | ~60 | ~1 MB | ~1-2 min | ~10s (reuse) |
Google Maps API cost per tile depends on the satellite-provider operator's Maps Platform pricing tier. The seed script does NOT bill — the producer's Google Maps account does.
## Failure modes
| Exit code | Meaning | Likely cause |
|-----------|---------|--------------|
| 71 | config file missing / malformed | `bbox.yaml` corrupted or wrong path |
| 72 | required env var missing | `SATELLITE_PROVIDER_URL` or `SATELLITE_PROVIDER_API_KEY` not set |
| 73 | satellite-provider unreachable | Service down, wrong URL, or TLS handshake failed (try `SATELLITE_PROVIDER_TLS_INSECURE=1`) |
| 74 | region request rejected | HTTP 4xx (auth, validation) or 5xx (producer crash); see stderr for HTTP body |
| 75 | one or more regions failed | Background processing failed — usually a Google Maps API quota / key issue on the producer side. Check `docker logs gps-denied-e2e-satellite-provider` |
| 76 | inventory verification mismatch | < 95% of expected tiles present; re-run to retry, or investigate producer logs |
## Cross-references
* AZ-777 spec: `_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md`
* AZ-777 Phase 1 (the wiring that makes this script callable): completed cycle 3 batch 105
* AZ-808 (parent-suite): strict validation for region-request endpoint — when this lands, malformed `seed_region.py` invocations will fail with RFC 7807 ValidationProblemDetails instead of silent zero-coercion; coordinate any consumer-side changes with that release
* AZ-812 (parent-suite): rename `RequestRegionRequest.{Latitude, Longitude}` → `{Lat, Lon}` for OSM consistency — when this lands, `seed_region.py` must be updated to send `lat`/`lon` instead of `latitude`/`longitude`
* satellite-provider Region API contract: today informally documented in `../../../../satellite-provider/_docs/02_document/modules/common_dtos.md::RegionRequest` + `system-flows.md` Flow F2; formal `region-request.md` contract will be published as part of AZ-808
+158
View File
@@ -0,0 +1,158 @@
# Derkachi reference tile catalog — bbox + zoom + imagery source metadata
#
# This file drives `seed_region.py`, which calls the parent-suite
# `satellite-provider` Region API (`POST /api/satellite/request`) to seed
# the satellite-provider DB with the Derkachi tiles the C6 reference
# catalog needs for AZ-777 Phase 2+.
#
# Authoritative spec: ../_docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md
# Probe-confirmed wire format (2026-05-22): {id, latitude, longitude,
# sizeMeters, zoomLevel, stitchTiles} — see AZ-812 for the planned
# latitude/longitude -> lat/lon OSM rename.
# ---------------------------------------------------------------------
# Bounding box — per AZ-777 spec
# ---------------------------------------------------------------------
# Spec bbox covers the Derkachi village area generously so the operator
# can fly any route within the box without re-seeding the catalog. This
# is INTENTIONALLY larger than the specific Derkachi flight extent (see
# `actual_flight_extent` below for the real per-flight footprint).
bbox:
lat_min: 50.05
lat_max: 50.15
lon_min: 36.05
lon_max: 36.15
# Approximate at lat ~50: 11.1 km (lat) x 7.14 km (lon) = ~79 km^2
# ---------------------------------------------------------------------
# Actual Derkachi flight extent (derived from data_imu.csv, 2026-05-22)
# ---------------------------------------------------------------------
# The real flight is much smaller than the spec bbox. Documented here
# for transparency. If only the specific Derkachi clip needs coverage,
# re-running seed_region.py with `--right-sized-flight` would seed
# ~60 tiles instead of ~4570 tiles (~98% reduction).
actual_flight_extent:
source: _docs/00_problem/input_data/flight_derkachi/data_imu.csv
lat_min: 50.080870
lat_max: 50.083159
lon_min: 36.107000
lon_max: 36.113401
# ~254 m (lat) x ~457 m (lon), centered at (50.082, 36.110)
# ---------------------------------------------------------------------
# Zoom levels — per AZ-777 spec
# ---------------------------------------------------------------------
zoom_levels:
- 15 # ~5 m / pixel — coarse coverage
- 16 # ~2.4 m / pixel — mid-altitude search
- 17 # ~1.2 m / pixel — close-altitude search
- 18 # ~0.6 m / pixel — VPR descriptor lock (primary)
# ---------------------------------------------------------------------
# Region API chunking strategy
# ---------------------------------------------------------------------
# Region API caps `sizeMeters` at 10000 (10 km side). The Derkachi bbox
# is 11.1 km along the lat axis, so a single region call cannot cover
# the full bbox. seed_region.py splits the bbox into N north-south
# chunks per zoom level (default: 2 chunks, each centered to cover the
# corresponding half of the bbox with sizeMeters=10000). Overlap is
# deduplicated server-side by the producer's UPSERT-on-tile-coord
# behavior so we do not pay Google Maps cost twice for overlapping
# tiles.
chunking:
chunks_per_zoom: 2
size_meters_per_chunk: 10000
stitch_tiles: false # individual tiles, not stitched composite
# ---------------------------------------------------------------------
# Imagery source — IMPORTANT: amended 2026-05-22 (see AZ-777 risk 4)
# ---------------------------------------------------------------------
# AZ-777 was originally specced with CARTO Voyager Basemap (CC-BY-3.0)
# as the upstream imagery source. The 2026-05-22 black-box probe of
# the running satellite-provider revealed the actual upstream is Google
# Maps (verified via satellite-provider/_docs/02_document/architecture.md
# line 49 + producer logs: `mt0..mt3.google.com/vt/lyrs=s`).
#
# Dev/research use is acceptable. Production deploy requires either:
# (a) Google Maps Platform licensing review for offline-cache use, OR
# (b) parent-suite ticket to add a true CC-BY satellite imagery
# provider to satellite-provider (Esri World Imagery, Mapbox
# satellite, Sentinel-2 via Copernicus, etc.).
imagery_source:
provider: google_maps
layer: lyrs=s # Google Maps satellite layer (high-resolution overhead)
resolution_at_z18_lat50_m_per_px: 0.384 # probe-measured 2026-05-22
fetch_pattern: mt[0-3].google.com/vt/lyrs=s&x={x}&y={y}&z={z}&token={session_token}
# ---------------------------------------------------------------------
# License attribution (operators must propagate to end users)
# ---------------------------------------------------------------------
license:
source: Google Maps Platform Terms of Service
url: https://cloud.google.com/maps-platform/terms
attribution_text: "Imagery © Google"
dev_use_only: true
production_warning: |
The Google Maps Platform ToS may restrict offline caching of map
tiles for use as a derivative reference dataset (e.g. a long-lived
VPR reference base). This catalog is approved for dev/research use
only. Before any production deployment, EITHER:
1. Obtain Google Maps Platform licensing approval for the C6
offline-cache use case, OR
2. Migrate to a CC-BY satellite imagery provider on the
satellite-provider side (parent-suite ticket TBD).
# ---------------------------------------------------------------------
# Catalog size budget — OVER BUDGET WARNING
# ---------------------------------------------------------------------
# Per AZ-777 spec line 178: "The seeded Derkachi catalog size budget is
# 100 MB on the satellite-provider DB side. Over budget -> reduce
# zoom-level coverage; document in bbox.yaml."
#
# Actual estimate against the spec bbox + zoom 15-18: ~11386 tiles
# (per seed_region.py --dry-run), ~148 MB (at probe-measured 13 KB/tile
# avg). This is ~48% OVER the spec's 100 MB budget.
#
# The original spec budget was likely calibrated for CARTO Voyager
# tiles, which are smaller than the Google Maps satellite tiles we
# actually fetch (CARTO is street-feature vector overlay; Google
# satellite is high-detail overhead JPEG). The reality is heavier.
#
# Three mitigation options (operator picks at run time):
# 1. Drop z=18 from `zoom_levels` -> ~3000 tiles, ~40 MB (in budget),
# but loses primary VPR descriptor-lock zoom. NOT RECOMMENDED for
# AC-4 / AZ-699 passes.
# 2. Reduce bbox -> e.g. 5x5 km tight to the flight cluster instead
# of the full village area. Coverage shrinks proportionally.
# 3. Use `--right-sized-flight` -> ~60 tiles, ~1 MB. Tight to the
# specific Derkachi clip; cannot fly an alternative path within
# the original spec bbox without re-seeding.
#
# Default behavior (no flag): seed the full spec bbox EVEN THOUGH IT
# EXCEEDS THE BUDGET. seed_region.py will print a loud warning at the
# start of the run if the estimated size is over budget, and the
# operator can interrupt + re-run with one of the mitigations above.
catalog_size_budget:
max_bytes_db_side: 104857600 # 100 MB (spec budget)
estimated_tile_count_spec_bbox: 11386
estimated_tile_count_right_sized: 60
estimated_avg_bytes_per_tile: 13046 # probe-measured 2026-05-22
estimated_total_bytes_spec_bbox: 148581256 # ~141.7 MB (~48% OVER)
estimated_total_bytes_right_sized: 782760 # ~0.75 MB
over_budget_warning: |
Default spec bbox seeding exceeds the spec's 100 MB budget by ~48%
when seeding all 4 zoom levels (15-18) with Google Maps satellite
imagery. Operator must choose: accept the overage, drop a zoom
level, reduce the bbox, or use --right-sized-flight.
# ---------------------------------------------------------------------
# Provenance
# ---------------------------------------------------------------------
provenance:
created_for: AZ-777 (gps-denied-onboard Derkachi C6 reference fixture)
bbox_source: AZ-777 spec outcome line 31 (50.05-50.15 lat, 36.05-36.15 lon)
flight_extent_source: data_imu.csv computed 2026-05-22
api_contract_source: probe-confirmed against running satellite-provider on Jetson, 2026-05-22
related_tickets:
- AZ-808 # validation for region-request endpoint
- AZ-812 # rename latitude/longitude -> lat/lon (consumer must update after AZ-812 lands)
+522
View File
@@ -0,0 +1,522 @@
#!/usr/bin/env python3
"""Seed the Derkachi reference tile catalog via satellite-provider's Region API.
AZ-777 Phase 2 deliverable. Reads ``bbox.yaml`` next to this script and
submits one or more ``POST /api/satellite/request`` calls per zoom level
to register the Derkachi bbox with the parent-suite satellite-provider.
Polls each region's status until terminal, then verifies the expected
tile count via ``POST /api/satellite/tiles/inventory``.
This script is intended to run from the gps-denied-onboard repo root
against a running satellite-provider (typically the Jetson e2e harness's
``satellite-provider`` service). It does NOT spin up the service itself
and does NOT modify any satellite-provider code or configuration.
Required environment (loaded from ``.env.test`` if not exported)::
SATELLITE_PROVIDER_URL e.g. https://satellite-provider:8080
SATELLITE_PROVIDER_API_KEY a valid HS256 JWT (mint with scripts/mint_dev_jwt.py)
SATELLITE_PROVIDER_TLS_INSECURE optional, "1" to accept self-signed dev certs
JWT_SECRET / JWT_ISSUER / JWT_AUDIENCE required only if --auto-mint-jwt is passed
Usage::
# mint a JWT then seed using defaults from bbox.yaml
export SATELLITE_PROVIDER_API_KEY="$(python scripts/mint_dev_jwt.py)"
python tests/fixtures/derkachi_c6/seed_region.py
# dry-run (validate config + auth without submitting requests)
python tests/fixtures/derkachi_c6/seed_region.py --dry-run
# right-sized to actual flight extent (faster, fewer tiles)
python tests/fixtures/derkachi_c6/seed_region.py --right-sized-flight
# write a JSON summary for downstream consumers (fixture / CI)
python tests/fixtures/derkachi_c6/seed_region.py --output-summary /tmp/seed.json
Exit codes::
0 all regions reached terminal status and inventory verification passed
71 config file missing / malformed
72 required env var missing
73 satellite-provider unreachable (TCP / TLS error)
74 region request rejected (HTTP 4xx / 5xx)
75 one or more regions failed during background processing
76 inventory verification mismatch (fewer tiles present than expected)
"""
from __future__ import annotations
import argparse
import json
import math
import os
import sys
import time
import uuid
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
try:
import httpx
except ImportError as exc:
sys.stderr.write(
f"ERROR: httpx not installed: {exc}\nRun `pip install -e .[dev]` from the repo root.\n"
)
sys.exit(72)
try:
import yaml
except ImportError as exc:
sys.stderr.write(
f"ERROR: PyYAML not installed: {exc}\nRun `pip install -e .[dev]` from the repo root.\n"
)
sys.exit(72)
_REQUEST_TIMEOUT_S = 30.0
_POLL_INTERVAL_S = 5.0
_POLL_MAX_ATTEMPTS = 60 # 60 * 5s = 5 min per region
_TERMINAL_STATUSES = frozenset({"completed", "failed", "error", "done", "succeeded"})
_FAILURE_STATUSES = frozenset({"failed", "error"})
@dataclass
class RegionChunk:
"""One Region API submission: a square area at one zoom level."""
zoom: int
center_lat: float
center_lon: float
size_meters: int
chunk_label: str # e.g. "z18-north" — for human-readable logs only
region_id: uuid.UUID = field(default_factory=uuid.uuid4)
submitted_status: str | None = None
terminal_status: str | None = None
tiles_downloaded: int = 0
tiles_reused: int = 0
csv_path: str | None = None
summary_path: str | None = None
def _load_env_file(path: Path) -> dict[str, str]:
"""Parse a KEY=VALUE env file. Honours quoting; ignores comments."""
if not path.is_file():
return {}
out: dict[str, str] = {}
for raw in path.read_text("utf-8").splitlines():
line = raw.strip()
if not line or line.startswith("#") or "=" not in line:
continue
key, _, value = line.partition("=")
out[key.strip()] = value.strip().strip('"').strip("'")
return out
def _resolve_env(name: str, env_file_values: dict[str, str]) -> str | None:
return os.environ.get(name) or env_file_values.get(name)
def _compute_chunks(config: dict[str, Any], right_sized: bool) -> list[RegionChunk]:
"""Plan all Region API submissions for one seeding pass.
Splits each zoom level into N chunks across the lat axis so each
chunk fits within the Region API's sizeMeters cap (10000).
"""
if right_sized:
bbox = config["actual_flight_extent"]
else:
bbox = config["bbox"]
chunks_per_zoom = config["chunking"]["chunks_per_zoom"]
size_meters = int(config["chunking"]["size_meters_per_chunk"])
zoom_levels = config["zoom_levels"]
if right_sized:
# The flight extent is < 1 km, one chunk per zoom is sufficient.
chunks_per_zoom = 1
size_meters = 1000
lat_centers: list[float]
if chunks_per_zoom == 1:
lat_centers = [(bbox["lat_min"] + bbox["lat_max"]) / 2.0]
else:
span = bbox["lat_max"] - bbox["lat_min"]
step = span / chunks_per_zoom
lat_centers = [bbox["lat_min"] + step * (i + 0.5) for i in range(chunks_per_zoom)]
center_lon = (bbox["lon_min"] + bbox["lon_max"]) / 2.0
chunks: list[RegionChunk] = []
for zoom in zoom_levels:
for idx, lat in enumerate(lat_centers):
label_suffix = f"chunk{idx}" if chunks_per_zoom > 1 else "single"
chunks.append(
RegionChunk(
zoom=zoom,
center_lat=lat,
center_lon=center_lon,
size_meters=size_meters,
chunk_label=f"z{zoom}-{label_suffix}",
)
)
return chunks
def _expected_tile_coords(config: dict[str, Any], right_sized: bool) -> list[tuple[int, int, int]]:
"""Compute the slippy-map (z, x, y) tile coords covering the bbox.
Used by the inventory verification step.
"""
if right_sized:
bbox = config["actual_flight_extent"]
else:
bbox = config["bbox"]
coords: list[tuple[int, int, int]] = []
for z in config["zoom_levels"]:
n = 2**z
x_min = int((bbox["lon_min"] + 180) / 360 * n)
x_max = int((bbox["lon_max"] + 180) / 360 * n)
y_min = int((1 - math.asinh(math.tan(math.radians(bbox["lat_max"]))) / math.pi) / 2 * n)
y_max = int((1 - math.asinh(math.tan(math.radians(bbox["lat_min"]))) / math.pi) / 2 * n)
for x in range(x_min, x_max + 1):
for y in range(y_min, y_max + 1):
coords.append((z, x, y))
return coords
def _submit_region(
client: httpx.Client, sp_url: str, headers: dict[str, str], chunk: RegionChunk
) -> tuple[bool, str]:
"""Submit one Region API request. Returns (success, message)."""
body = {
"id": str(chunk.region_id),
"latitude": chunk.center_lat,
"longitude": chunk.center_lon,
"sizeMeters": chunk.size_meters,
"zoomLevel": chunk.zoom,
"stitchTiles": False,
}
try:
resp = client.post(
f"{sp_url}/api/satellite/request",
headers=headers,
json=body,
timeout=_REQUEST_TIMEOUT_S,
)
except httpx.HTTPError as exc:
return False, f"network error: {exc}"
if resp.status_code != 200:
return False, f"HTTP {resp.status_code}: {resp.text[:200]}"
try:
payload = resp.json()
chunk.submitted_status = payload.get("status")
except json.JSONDecodeError as exc:
return False, f"unexpected response body (not JSON): {exc}; raw={resp.text[:200]}"
return True, f"submitted; initial status={chunk.submitted_status}"
def _poll_region(
client: httpx.Client, sp_url: str, headers: dict[str, str], chunk: RegionChunk
) -> str:
"""Poll one Region until terminal status. Updates chunk fields in-place.
Returns the final status string. Raises RuntimeError on timeout.
"""
for attempt in range(1, _POLL_MAX_ATTEMPTS + 1):
try:
resp = client.get(
f"{sp_url}/api/satellite/region/{chunk.region_id}",
headers=headers,
timeout=_REQUEST_TIMEOUT_S,
)
resp.raise_for_status()
payload = resp.json()
except (httpx.HTTPError, json.JSONDecodeError) as exc:
sys.stderr.write(f" [{chunk.chunk_label}] poll attempt {attempt} failed: {exc}\n")
time.sleep(_POLL_INTERVAL_S)
continue
status = (payload.get("status") or "").lower()
chunk.terminal_status = status
chunk.tiles_downloaded = payload.get("tilesDownloaded", 0)
chunk.tiles_reused = payload.get("tilesReused", 0)
chunk.csv_path = payload.get("csvFilePath")
chunk.summary_path = payload.get("summaryFilePath")
if status in _TERMINAL_STATUSES:
return status
if attempt % 6 == 0: # every ~30s
sys.stderr.write(
f" [{chunk.chunk_label}] still {status} (attempt {attempt}/{_POLL_MAX_ATTEMPTS})\n"
)
time.sleep(_POLL_INTERVAL_S)
raise RuntimeError(
f"region {chunk.region_id} ({chunk.chunk_label}) did not reach terminal "
f"status within {_POLL_MAX_ATTEMPTS * _POLL_INTERVAL_S:.0f}s"
)
def _verify_inventory(
client: httpx.Client,
sp_url: str,
headers: dict[str, str],
expected_coords: list[tuple[int, int, int]],
) -> tuple[int, int]:
"""Query inventory for the expected tile coords. Returns (present, total)."""
BATCH_SIZE = 5000
total_present = 0
total = 0
for batch_start in range(0, len(expected_coords), BATCH_SIZE):
batch = expected_coords[batch_start : batch_start + BATCH_SIZE]
body = {"tiles": [{"z": z, "x": x, "y": y} for z, x, y in batch]}
try:
resp = client.post(
f"{sp_url}/api/satellite/tiles/inventory",
headers=headers,
json=body,
timeout=_REQUEST_TIMEOUT_S,
)
resp.raise_for_status()
payload = resp.json()
except (httpx.HTTPError, json.JSONDecodeError) as exc:
sys.stderr.write(f"inventory batch starting at {batch_start} failed: {exc}\n")
continue
results = payload.get("results", [])
total += len(results)
total_present += sum(1 for r in results if r.get("present"))
return total_present, total
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--bbox-config",
type=Path,
default=Path(__file__).parent / "bbox.yaml",
help="Path to bbox.yaml (default: alongside this script).",
)
parser.add_argument(
"--env-file",
type=Path,
default=Path(".env.test"),
help="Fallback env file (default: .env.test in CWD).",
)
parser.add_argument(
"--output-summary",
type=Path,
default=None,
help="Optional path to write a JSON summary of the seeding run.",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Plan + validate auth, but do not submit Region requests.",
)
parser.add_argument(
"--right-sized-flight",
action="store_true",
help=(
"Use the actual_flight_extent bbox (~1 km^2) instead of the full "
"AZ-777 spec bbox (~80 km^2). ~98%% fewer tiles, useful when only "
"the specific Derkachi clip needs coverage."
),
)
parser.add_argument(
"--skip-poll",
action="store_true",
help="Submit all regions but do not poll; exit immediately after submission.",
)
parser.add_argument(
"--skip-inventory-verification",
action="store_true",
help="Skip the final inventory verification step.",
)
args = parser.parse_args()
if not args.bbox_config.is_file():
sys.stderr.write(f"ERROR: bbox config not found: {args.bbox_config}\n")
return 71
try:
config = yaml.safe_load(args.bbox_config.read_text("utf-8"))
except yaml.YAMLError as exc:
sys.stderr.write(f"ERROR: failed to parse {args.bbox_config}: {exc}\n")
return 71
env_file_values = _load_env_file(args.env_file)
sp_url = _resolve_env("SATELLITE_PROVIDER_URL", env_file_values)
jwt_token = _resolve_env("SATELLITE_PROVIDER_API_KEY", env_file_values)
tls_insecure = _resolve_env("SATELLITE_PROVIDER_TLS_INSECURE", env_file_values) == "1"
if not sp_url:
sys.stderr.write("ERROR: SATELLITE_PROVIDER_URL not set (env or .env.test).\n")
return 72
if not jwt_token:
sys.stderr.write(
"ERROR: SATELLITE_PROVIDER_API_KEY not set. Mint with:\n"
" python scripts/mint_dev_jwt.py\n"
)
return 72
chunks = _compute_chunks(config, args.right_sized_flight)
expected_coords = _expected_tile_coords(config, args.right_sized_flight)
# Budget check — loud warning if over-budget per AZ-777 spec line 178.
avg_bytes = int(config["catalog_size_budget"]["estimated_avg_bytes_per_tile"])
budget_bytes = int(config["catalog_size_budget"]["max_bytes_db_side"])
estimated_total = len(expected_coords) * avg_bytes
over_budget = estimated_total > budget_bytes
print(
f"[plan] satellite-provider: {sp_url} (tls_insecure={tls_insecure})\n"
f"[plan] bbox mode: {'right-sized flight' if args.right_sized_flight else 'spec bbox (~80 km^2)'}\n"
f"[plan] zoom levels: {config['zoom_levels']}\n"
f"[plan] region chunks to submit: {len(chunks)}\n"
f"[plan] expected tile coverage: {len(expected_coords)} tiles\n"
f"[plan] estimated DB size: {estimated_total / 1_048_576:.1f} MB "
f"(budget: {budget_bytes / 1_048_576:.0f} MB)\n"
f"[plan] imagery source: {config['imagery_source']['provider']}/{config['imagery_source']['layer']}\n"
f"[plan] license: {config['license']['source']}\n"
f"[plan] attribution: {config['license']['attribution_text']}\n"
)
if over_budget:
overage_pct = (estimated_total - budget_bytes) / budget_bytes * 100
sys.stderr.write(
"WARNING: estimated DB size exceeds spec budget by "
f"~{overage_pct:.0f}%. Per AZ-777 line 178 you can:\n"
" - drop a zoom level (edit bbox.yaml::zoom_levels)\n"
" - reduce bbox (edit bbox.yaml::bbox)\n"
" - use --right-sized-flight (tight to actual flight extent)\n"
"Continuing anyway. Use --dry-run to inspect without seeding.\n"
)
if args.dry_run:
print("[dry-run] would submit:")
for c in chunks:
print(
f" {c.chunk_label}: id={c.region_id} "
f"lat={c.center_lat:.5f} lon={c.center_lon:.5f} "
f"size={c.size_meters} zoom={c.zoom}"
)
return 0
headers = {
"Authorization": f"Bearer {jwt_token}",
"Content-Type": "application/json",
}
client = httpx.Client(verify=not tls_insecure)
try:
# ----- Phase A: submit all regions upfront -----
print(f"\n[submit] sending {len(chunks)} region requests...")
submission_failures: list[tuple[RegionChunk, str]] = []
for c in chunks:
ok, msg = _submit_region(client, sp_url, headers, c)
print(f" [{c.chunk_label}] {msg}")
if not ok:
submission_failures.append((c, msg))
if submission_failures:
sys.stderr.write(f"ERROR: {len(submission_failures)} submission(s) failed:\n")
for c, msg in submission_failures:
sys.stderr.write(f" [{c.chunk_label}] {msg}\n")
return 74
if args.skip_poll:
print(
"\n[skip-poll] all submissions sent; "
"background processing continues asynchronously. "
f"Region IDs: {[str(c.region_id) for c in chunks]}"
)
return 0
# ----- Phase B: poll each region until terminal -----
print(f"\n[poll] waiting for {len(chunks)} regions to reach terminal status...")
poll_failures: list[RegionChunk] = []
for c in chunks:
try:
status = _poll_region(client, sp_url, headers, c)
except RuntimeError as exc:
sys.stderr.write(f" [{c.chunk_label}] {exc}\n")
poll_failures.append(c)
continue
tiles = c.tiles_downloaded + c.tiles_reused
print(
f" [{c.chunk_label}] terminal={status} tiles={tiles} "
f"(downloaded={c.tiles_downloaded} reused={c.tiles_reused})"
)
if status in _FAILURE_STATUSES:
poll_failures.append(c)
if poll_failures:
sys.stderr.write(f"ERROR: {len(poll_failures)} region(s) did not complete cleanly\n")
return 75
# ----- Phase C: verify inventory -----
if not args.skip_inventory_verification:
print(f"\n[inventory] verifying {len(expected_coords)} expected tile coords...")
present, queried = _verify_inventory(client, sp_url, headers, expected_coords)
print(
f"[inventory] present: {present}/{queried} "
f"({present / queried * 100:.1f}% coverage)"
if queried
else "[inventory] no tiles queried"
)
if queried and present < queried:
missing = queried - present
sys.stderr.write(
f"WARNING: {missing} expected tile(s) not present in inventory. "
"This may indicate partial region failures, edge-tile gaps, or "
"Google Maps API timeouts. Re-run seed_region.py to fill gaps "
"(producer dedups via UPSERT-on-coord, so retries are safe).\n"
)
if present / queried < 0.95:
return 76
finally:
client.close()
# ----- Summary output -----
total_downloaded = sum(c.tiles_downloaded for c in chunks)
total_reused = sum(c.tiles_reused for c in chunks)
print(
f"\n[done] seeded {len(chunks)} regions: "
f"downloaded={total_downloaded} reused={total_reused}"
)
if args.output_summary:
summary = {
"sp_url": sp_url,
"bbox_mode": "right-sized" if args.right_sized_flight else "spec",
"imagery_source": config["imagery_source"],
"license": config["license"],
"chunks": [
{
"label": c.chunk_label,
"region_id": str(c.region_id),
"zoom": c.zoom,
"center_lat": c.center_lat,
"center_lon": c.center_lon,
"size_meters": c.size_meters,
"terminal_status": c.terminal_status,
"tiles_downloaded": c.tiles_downloaded,
"tiles_reused": c.tiles_reused,
"csv_path": c.csv_path,
"summary_path": c.summary_path,
}
for c in chunks
],
"totals": {
"regions": len(chunks),
"tiles_downloaded": total_downloaded,
"tiles_reused": total_reused,
},
}
args.output_summary.parent.mkdir(parents=True, exist_ok=True)
args.output_summary.write_text(json.dumps(summary, indent=2))
print(f"[done] summary written to {args.output_summary}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
@@ -36,7 +36,6 @@ from gps_denied_onboard.components.c11_tile_manager import (
request_hash,
)
_BASE_URL = "https://parent-suite.test"
_INVENTORY_PATH = "/api/satellite/tiles/inventory"
_TILES_PATH_PREFIX = "/tiles/"
@@ -111,14 +110,10 @@ class _StubTileWriter:
}
)
if call_index in self.rejected_indices:
raise _StubFreshnessRejection(
f"freshness rejected call_index={call_index}"
)
raise _StubFreshnessRejection(f"freshness rejected call_index={call_index}")
return self.labels_by_index.get(call_index, "fresh")
def tile_already_present(
self, *, zoom_level: int, lat: float, lon: float
) -> bool:
def tile_already_present(self, *, zoom_level: int, lat: float, lon: float) -> bool:
self.exists_calls.append((zoom_level, lat, lon))
return False
@@ -238,9 +233,9 @@ def _inventory_entry_for_coord(
) -> dict[str, Any]:
if not present:
return {
"tileZoom": int(zoom),
"tileX": int(x),
"tileY": int(y),
"z": int(zoom),
"x": int(x),
"y": int(y),
"locationHash": str(uuid4()),
"present": False,
"id": None,
@@ -251,9 +246,9 @@ def _inventory_entry_for_coord(
}
captured = captured_at or datetime(2026, 5, 13, 0, 0, 0, tzinfo=timezone.utc)
return {
"tileZoom": int(zoom),
"tileX": int(x),
"tileY": int(y),
"z": int(zoom),
"x": int(x),
"y": int(y),
"locationHash": str(uuid4()),
"present": True,
"id": str(uuid4()),
@@ -303,9 +298,9 @@ def _make_inventory_handler(
resolution = 0.5
results.append(
_inventory_entry_for_coord(
zoom=int(t["tileZoom"]),
x=int(t["tileX"]),
y=int(t["tileY"]),
zoom=int(t["z"]),
x=int(t["x"]),
y=int(t["y"]),
present=is_present,
resolution_m_per_px=resolution,
)
@@ -331,12 +326,8 @@ def _make_inventory_handler(
def test_ac1_100_tile_happy_path_writes_all(tmp_path: Path) -> None:
# Arrange — bbox at zoom 14 produces N coord candidates; the
# stub marks the first 100 as `present=true` and the rest absent.
transport = httpx.MockTransport(
_make_inventory_handler(present_count=100)
)
(downloader, _logs, writer, enforcer, _sleeps) = _build_downloader(
transport=transport
)
transport = httpx.MockTransport(_make_inventory_handler(present_count=100))
(downloader, _logs, writer, enforcer, _sleeps) = _build_downloader(transport=transport)
request = _make_request(cache_root=tmp_path)
# Act
@@ -366,9 +357,7 @@ def test_ac2_resolution_gate_rejects_sub_spec_tiles(tmp_path: Path) -> None:
resolution_override_for_first_n=(10, 0.3),
)
)
(downloader, log_records, writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, log_records, writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
# Act
report = downloader.download_tiles_for_area(_make_request(cache_root=tmp_path))
@@ -377,7 +366,9 @@ def test_ac2_resolution_gate_rejects_sub_spec_tiles(tmp_path: Path) -> None:
assert report.tiles_rejected_resolution == 10
assert report.tiles_downloaded == 40
assert len(writer.write_calls) == 40
res_warnings = [r for r in log_records if getattr(r, "kind", "") == "c11.download.resolution_rejected"]
res_warnings = [
r for r in log_records if getattr(r, "kind", "") == "c11.download.resolution_rejected"
]
assert len(res_warnings) == 10
@@ -402,7 +393,9 @@ def test_ac3_freshness_rejections_counted_and_run_continues(tmp_path: Path) -> N
assert report.tiles_downloaded == 5
assert report.outcome == DownloadOutcome.SUCCESS
summary_warns = [
r for r in log_records if getattr(r, "kind", "") == "c11.download.freshness_rejected_summary"
r
for r in log_records
if getattr(r, "kind", "") == "c11.download.freshness_rejected_summary"
]
assert len(summary_warns) == 1
@@ -483,9 +476,7 @@ def test_ac6_persistent_5xx_raises_satellite_provider_error(tmp_path: Path) -> N
tile_response_factory=_factory,
)
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
# Act / Assert
with pytest.raises(SatelliteProviderError):
@@ -512,9 +503,7 @@ def test_ac7_401_fails_fast_no_retry(tmp_path: Path) -> None:
tile_response_factory=_factory,
)
)
(downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, _log_records, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
# Act / Assert
with pytest.raises(SatelliteProviderError):
@@ -541,9 +530,7 @@ def test_ac8_idempotent_rerun_yields_no_op(tmp_path: Path) -> None:
tile_response_factory=_factory,
)
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
request = _make_request(cache_root=tmp_path)
# Act — first run
@@ -578,10 +565,8 @@ def test_ac9_cache_budget_pre_check_aborts(tmp_path: Path) -> None:
tile_response_factory=_factory,
)
)
enforcer = _StubBudgetEnforcer(
raise_on_call=CacheBudgetExceededError("no headroom")
)
(downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader(
enforcer = _StubBudgetEnforcer(raise_on_call=CacheBudgetExceededError("no headroom"))
(downloader, _log_records, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport, budget_enforcer=enforcer
)
@@ -606,16 +591,12 @@ def test_ac11_service_api_key_never_appears_in_logs(tmp_path: Path) -> None:
inventory_response_override=httpx.Response(401),
)
)
(downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, log_records, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
with pytest.raises(SatelliteProviderError):
downloader.download_tiles_for_area(_make_request(cache_root=tmp_path))
# Assert
flat = " ".join(
r.getMessage() + json.dumps(getattr(r, "kv", {})) for r in log_records
)
flat = " ".join(r.getMessage() + json.dumps(getattr(r, "kv", {})) for r in log_records)
assert _API_KEY not in flat
assert "Bearer ***" in flat
@@ -642,9 +623,7 @@ def test_ac12_partial_journal_resumed_on_rerun(tmp_path: Path) -> None:
tile_response_factory=_factory,
)
)
(downloader, _logs, writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, _logs, writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
request = _make_request(cache_root=tmp_path)
# First run — completes
@@ -771,14 +750,10 @@ def test_nfr_throughput_1000_tiles_under_budget(tmp_path: Path) -> None:
transport = httpx.MockTransport(
_make_inventory_handler(
present_count=1000,
tile_response_factory=lambda r: httpx.Response(
200, content=b"\xff\xd8tile"
),
tile_response_factory=lambda r: httpx.Response(200, content=b"\xff\xd8tile"),
)
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(
transport=transport
)
(downloader, _logs, _writer, _enforcer, _sleeps) = _build_downloader(transport=transport)
# Big bbox at zoom 14: ~ 32x32 tile span on this latitude is enough.
# 1° ≈ 45 tiles at zoom 14 in latitude → 0.75° gives ≈ 33 tiles → ~1089 tiles.