C11 — Tile Manager
1. High-Level Overview
Purpose: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in three directions:
- Route seed (pre-flight, F1, route-driven variant; Cycle 3 / Epic AZ-835): submit a tlog-derived `RouteSpec` (waypoints + per-waypoint coverage radius, produced by `replay_input.tlog_route.extract_route_from_tlog`, AZ-836) to `satellite-provider`'s Route API and poll until corridor tile materialisation completes. This lets the operator pre-commit the cache to where the drone actually flew rather than to a bounding box.
- Download (pre-flight, F1): fetch tiles from `satellite-provider` for the operational area, apply AC-NEW-6 freshness gating, and write into C6 (`TileStore` + `TileMetadataStore`). C11 is the only path that crosses the workstation/companion enclave to the parent suite for tile pixels; C10 reads from the populated C6 store and never touches `satellite-provider` itself.
- Upload (post-landing, F10): read pending mid-flight tiles from C6 and POST to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract sketch). C11 itself does NOT gate on flight state; it is a dumb pipe. The post-landing safety gate is owned by C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which checks the C13 `flight_footer` FDR record for `clean_shutdown=True` before invoking `TileUploader.upload_pending_tiles`.
C11 is a separate operator-side binary / image. The airborne companion image's CMake target deliberately excludes the entire c11_tile_manager/ source tree so the airborne process cannot accidentally execute the seed path, the download path, or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). All three directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne.
Architectural Pattern: Pipeline behind three interfaces (SatelliteProviderRouteClient, TileDownloader, TileUploader) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The three interfaces are bundled into C11 because they share auth (JWT Bearer + optional TLS-insecure flag for dev self-signed certs across all three; the upload direction additionally signs each tile with the per-flight onboard signing key), HTTP client (httpx), network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into separate components would duplicate all of that. They are kept as three interfaces so SRP is preserved at the call-site level: C12 binds SatelliteProviderRouteClient.seed_route to materialise the corridor cache from a tlog (cycle 3 e2e fixture today; planned C12 production path), TileDownloader.download_tiles_for_area for the F1 bbox-driven cache-build workflow, TileUploader.upload_pending_tiles for the F10 post-landing trigger; none is forced to depend on the others.
Cycle-1 operational reality: C11 is operator-workstation-only, NOT an airborne strategy slot — there is no c11_tile_manager slot in _AIRBORNE_REGISTRATIONS, no row in AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS, and the airborne companion image's build target deliberately excludes the entire c11_tile_manager/ source tree (ADR-004 process-level isolation; AC-8.4). The operator binary composes C11 via runtime_root/c11_factory.py, which exposes three tiny per-service factories — build_per_flight_key_manager (AZ-318), build_tile_uploader (AZ-319 + AZ-320), and build_tile_downloader (AZ-316) — each called explicitly by C12's CLI; no central registry. FDR wiring goes through the per-producer make_fdr_client cache: AZ-318 PerFlightKeyManager defaults to make_fdr_client("c11_tile_manager.signing_key", config), AZ-319 HttpTileUploader to make_fdr_client("c11_tile_manager.tile_uploader", config) — both distinct from the airborne "airborne_main" producer, so the operator-workstation process gets its own per-component FdrClient instances rather than sharing the airborne singleton. AZ-320's IdempotentRetryTileUploader decorator wraps HttpTileUploader by default (per-call + per-tile bounded retry); config.components['c11_tile_manager'].disable_retry_decorator = True suppresses the wrap for low-level debugging or test wiring that needs to observe the inner uploader. The AZ-507 cross-component cut keeps C11 from importing C6 directly: tile_store / tile_metadata_store are passed in by the operator-binary composition root as consumer-side cuts; http_client (an httpx.Client) is also caller-owned so tests can swap in httpx.MockTransport. AZ-687 replay-mode guard does not apply — C11 has no airborne footprint.
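The default wrap and its opt-out can be pictured with a minimal sketch. It assumes a plain-dict config, and `RetryingUploader`, `TransientUploadError`, and `build_uploader_sketch` are hypothetical stand-ins rather than the real `IdempotentRetryTileUploader` or `build_tile_uploader` signatures, which are not shown in this document.

```python
from typing import Any, Mapping, Optional, Protocol


class TransientUploadError(RuntimeError):
    """Hypothetical error type standing in for a retryable upload failure."""


class Uploader(Protocol):
    def upload_pending_tiles(self, request: Any) -> Any: ...


class RetryingUploader:
    """Illustrative stand-in for a bounded-retry decorator around an uploader."""

    def __init__(self, inner: Uploader, max_attempts: int = 3) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be >= 1")
        self.inner = inner
        self.max_attempts = max_attempts

    def upload_pending_tiles(self, request: Any) -> Any:
        last_error: Optional[Exception] = None
        for _ in range(self.max_attempts):
            try:
                return self.inner.upload_pending_tiles(request)
            except TransientUploadError as exc:
                last_error = exc  # remember the failure and try again
        assert last_error is not None
        raise last_error


def build_uploader_sketch(config: Mapping[str, Any], inner: Uploader) -> Uploader:
    """Wrap `inner` unless the disable flag is set (config shape is assumed)."""
    component_cfg = config.get("components", {}).get("c11_tile_manager", {})
    if component_cfg.get("disable_retry_decorator", False):
        return inner  # tests / debugging observe the inner uploader directly
    return RetryingUploader(inner)
```

With the flag set, the caller receives the inner uploader object itself, which is what makes it observable in test wiring.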
Cycle-3 operational reality (AZ-777 Phase 1 + Epic AZ-835): the e2e harness now wires the e2e-runner against the real parent-suite satellite-provider .NET service in docker-compose.test.jetson.yml (lineage AZ-688 / AZ-691 / AZ-692; tier-1 docker-compose.test.yml deprecated 2026-05-20). Two consequences cascaded into C11:
- `TileDownloader` contract adaptation (AZ-777 Phase 1): `HttpTileDownloader._INVENTORY_PATH = "/api/satellite/tiles/inventory"` (POST, bulk lookup by (z,x,y)) and `HttpTileDownloader._TILES_PATH = "/tiles"` (GET, slippy-map fetch via `/tiles/{z}/{x}/{y}`). Previously documented as `GET /api/satellite/tiles?bbox=…&zoom=…`; the real `satellite-provider` API surface uses the inventory + slippy-map split per `tile-inventory.md` v1.0.0 (AZ-505). The bbox-driven `download_tiles_for_area` entry point and its `DownloadRequest` / `DownloadBatchReport` DTOs are unchanged at the call-site level; the contract adaptation is internal. Because the inventory response does not carry a `Content-Length` hint, AZ-308's pre-write budget check uses `_DEFAULT_ESTIMATED_TILE_BYTES` = 50 000 (conservative over-reserve; typical 256×256 JPEG basemap tile is 8–80 KiB). Auth is `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
- Third interface, `SatelliteProviderRouteClient` (AZ-838 / Epic AZ-835 C2): `seed_route(spec: RouteSpec) -> RouteSeedResult` submits the spec to `POST /api/satellite/route` (`requestMaps=true`, `createTilesZip=false`), polls `GET /api/satellite/route/{id}` until `mapsReady=true` (or a terminal-failure status), then verifies coverage via `POST /api/satellite/tiles/inventory`. It pre-emptively enforces AZ-809's `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) so obviously bad input fails before the HTTP POST. Default cadence: `poll_interval_s = 5.0`, `poll_max_attempts = 60`, `request_timeout_s = 30.0`. Errors form a dedicated hierarchy (`RouteValidationError`: 4xx + RFC 7807 ProblemDetails; `RouteTransientError`: 5xx / network / timeout with `__cause__` set; `RouteTerminalFailureError`: non-success terminal status) rooted at `SatelliteProviderRouteError`, independent of `TileManagerError` because the Route API is a corridor-onboarding flow, not a per-tile transfer.
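The pre-emptive bounds check could look roughly like the sketch below. The numeric limits for point count, region size, and zoom come from the text above; the latitude/longitude limits (±90 / ±180) are an assumption, since the document only says "lat/lon ranges". The function name, the error-key spelling, and the local `RouteValidationError` class are hypothetical stand-ins.

```python
from typing import Dict, Sequence, Tuple

# Bounds quoted in the text (AZ-809 CreateRouteRequestValidator).
MIN_POINTS, MAX_POINTS = 2, 500
MIN_REGION_M, MAX_REGION_M = 100.0, 10_000.0
MIN_ZOOM, MAX_ZOOM = 0, 22


class RouteValidationError(ValueError):
    """Stand-in for the component's error; the real class shape is assumed."""

    def __init__(self, errors: Dict[str, str]) -> None:
        super().__init__(f"route validation failed: {errors}")
        self.errors = errors


def validate_route_request(
    points: Sequence[Tuple[float, float]],
    region_size_meters: float,
    zoom_level: int,
) -> None:
    """Collect every violation, then raise once, before any HTTP call is made."""
    errors: Dict[str, str] = {}
    if not MIN_POINTS <= len(points) <= MAX_POINTS:
        errors["points"] = f"count {len(points)} outside {MIN_POINTS}..{MAX_POINTS}"
    if not MIN_REGION_M <= region_size_meters <= MAX_REGION_M:
        errors["regionSizeMeters"] = f"{region_size_meters} outside 100..10000"
    if not MIN_ZOOM <= zoom_level <= MAX_ZOOM:
        errors["zoomLevel"] = f"{zoom_level} outside {MIN_ZOOM}..{MAX_ZOOM}"
    for index, (lat, lon) in enumerate(points):
        # Assumed WGS84 ranges; the source does not spell these out.
        if not -90.0 <= lat <= 90.0:
            errors[f"points[{index}].lat"] = f"{lat} outside -90..90"
        if not -180.0 <= lon <= 180.0:
            errors[f"points[{index}].lon"] = f"{lon} outside -180..180"
    if errors:
        raise RouteValidationError(errors)
```

Collecting all violations into one dict mirrors the RFC 7807 `errors` shape mentioned above, so a locally rejected request and a server-rejected one can be reported to the operator the same way.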
The route-driven path is exercised today by tests/e2e/replay/conftest.py::operator_pre_flight_setup (AZ-839 — replaces the cycle-1 mkdir placeholder; yields a PopulatedC6Cache dataclass) and tests/e2e/replay/test_az835_e2e_real_flight.py (AZ-840 — single test that takes only (tlog, video, calibration) and runs the full 7-step pipeline). The C12 production CLI binding for the route path is a future-cycle integration; today's C12 still drives only download_tiles_for_area for production pre-flight cache builds.
Upstream dependencies:
- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing. (Cycle-3 e2e fixtures also drive `SatelliteProviderRouteClient.seed_route(...)` for the route-driven F1 variant; C12 production binding for the route path is a future cycle.)
- C6 TileStore + TileMetadataStore → write target during download (`source = googlemaps`); read source during upload (`source = onboard_ingest`, `voting_status = pending`).
- `replay_input.tlog_route.RouteSpec` (AZ-836; `_types/route.py` canonical home per AZ-845) → input DTO to `SatelliteProviderRouteClient.seed_route`.
- Operator workstation OS → invocation entry point (CLI / tray app, owned by C12).
- `satellite-provider` (external) → for download: `POST /api/satellite/tiles/inventory` (bulk lookup by (z,x,y)) + `GET /tiles/{z}/{x}/{y}` (slippy-map fetch, per `tile-inventory.md` v1.0.0 / AZ-505); for route seeding: `POST /api/satellite/route` + `GET /api/satellite/route/{id}` (per `CreateRouteRequest.cs` DTO + AZ-809 validator); for upload: `POST /api/satellite/tiles/ingest` (D-PROJ-2 design task #1, planned, not yet implemented service-side).
Downstream consumers:
- C10 CacheProvisioner reads the populated C6 store after a `TileDownloader` run completes; C10 does not call C11 directly. C12 sequences the two steps.
- On the upload side: none on the onboard side; the parent-suite voting layer (D-PROJ-2 design task #2) consumes the uploaded tiles asynchronously.
2. Internal Interfaces
Interface: SatelliteProviderRouteClient (cycle 3 — AZ-838 / Epic AZ-835 C2)
| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| `seed_route` | `RouteSpec` (from `_types/route.py`; `name: str \| None` optional) | `RouteSeedResult` | No (poll loop; seconds–minutes) | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) |
Interface: TileDownloader
| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| `download_tiles_for_area` | `DownloadRequest` | `DownloadBatchReport` | No (offline; minutes) | `SatelliteProviderError`, `FreshnessRejectionError`, `ResolutionRejectionError`, `CacheBudgetExceededError`, `IdempotentNoOp` |
| `enumerate_remote_coverage` | `bbox: BoundingBox`, `zoom_levels: list[int]` | `list[TileSummary]` | No | `SatelliteProviderError` |
Interface: TileUploader
| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| `enumerate_pending_tiles` | `flight_id: uuid` (optional) | `list[TileMetadata]` | No | `TileMetadataError` |
| `upload_pending_tiles` | `UploadRequest` | `UploadBatchReport` | No | `SatelliteProviderError`, `RateLimitedError`, `SignatureRejectedError` |
C11 no longer exposes confirm_flight_state — the post-landing flight-state gate moved to C12 (PostLandingUploadOrchestrator, AZ-329) per Batch 44. FlightStateNotOnGroundError is retired from C11; the corresponding refusal now lives at the C12 boundary as FlightStateNotConfirmedError.
Input/Output DTOs:
RouteSpec (cycle 3 — _types/route.py, produced by replay_input/tlog_route.py):
waypoints: tuple[tuple[float, float], ...] # (lat, lon), 1..max_waypoints
suggested_region_size_meters: float # per-waypoint coverage radius
source_tlog: Path # provenance
source_segment: tuple[int, int] # (start_idx, end_idx) into tlog GPS rows
total_distance_meters: float # along-track distance of active segment
RouteSeedResult (cycle 3 — c11_tile_manager.route_client):
route_id: uuid
terminal_status: string
maps_ready: bool
tile_count: int
elapsed_ms: int
submitted_payload_sha256: string
DownloadRequest:
bbox: BoundingBox (lat_min, lon_min, lat_max, lon_max)
zoom_levels: list[int]
sector_class: enum {active_conflict, stable_rear}
satellite_provider_url: URL
service_api_key: string
cache_root: Path
DownloadBatchReport:
tiles_downloaded: int
tiles_rejected_freshness: int
tiles_rejected_resolution: int # RESTRICT-SAT-4 < 0.5 m/px
tiles_downgraded: int
freshness_summary: dict[freshness_label, count]
outcome: enum {success, failure, idempotent_no_op}
failure_reason: string (optional)
UploadRequest:
flight_id: uuid (optional; defaults to all flights with pending tiles)
batch_size: int
satellite_provider_url: URL
UploadBatchReport:
batch_uuid: uuid (assigned by satellite-provider per D-PROJ-2 contract)
per_tile_status: list[(tile_id, status: enum {queued, rejected, duplicate, superseded})]
retry_count: int
next_retry_at_s: int (when retried)
outcome: enum {success, partial, failure}
3. External API Specification
C11 is a client of satellite-provider's REST surface in three directions.
3.1 Route seed — corridor materialisation (cycle 3 — AZ-838 / Epic AZ-835 C2)
| Endpoint | Method | Auth | Rate Limit | Description |
|---|---|---|---|---|
| `/api/satellite/route` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Submit a `RouteSpec` (waypoints + region size + zoom level). Body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` (`lat` / `lon` JSON property names) / `GeoPoint.cs` DTOs. Query: `requestMaps=true&createTilesZip=false`. Validated pre-emptively against AZ-809 `CreateRouteRequestValidator` rules. |
| `/api/satellite/route/{id}` | GET | same as above | parent-suite enforces | Poll route processing status. Returns `mapsReady: bool` + a `status` string. Terminal-success: `mapsReady=true`. Terminal-failure: `status` ∈ {failed, error, rejected}. Default cadence: 5 s × ≤ 60 attempts. |
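The polling contract in the table can be sketched as a small loop. This is an illustration under assumptions, not the client's actual code: `fetch_status` is a hypothetical callable returning the decoded JSON body of `GET /api/satellite/route/{id}`, and raising `RoutePollTimeoutError` once the attempts are exhausted is an assumed behaviour (the document does not say which error the real client raises in that case).

```python
import time
from typing import Any, Callable, Mapping

TERMINAL_FAILURE_STATUSES = {"failed", "error", "rejected"}  # from the table above


class RouteTerminalFailureError(RuntimeError):
    """Stand-in for the component's error of the same name."""


class RoutePollTimeoutError(RuntimeError):
    """Hypothetical: raised here when the attempt budget runs out."""


def poll_until_ready(
    fetch_status: Callable[[], Mapping[str, Any]],
    poll_interval_s: float = 5.0,
    poll_max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the 1-based attempt on which mapsReady became true."""
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()
        if body.get("mapsReady") is True:
            return attempt
        status = str(body.get("status", "")).lower()
        if status in TERMINAL_FAILURE_STATUSES:
            raise RouteTerminalFailureError(f"route ended with status={status!r}")
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # no sleep after the final attempt
    raise RoutePollTimeoutError(f"not ready after {poll_max_attempts} attempts")
```

Injecting `sleep` keeps the loop testable without real waiting; with the defaults the loop issues at most 60 status requests spaced 5 s apart.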
3.2 Download — read path (satellite-provider v1.0.0 inventory contract — AZ-505 / AZ-777 Phase 1)
| Endpoint | Method | Auth | Rate Limit | Description |
|---|---|---|---|---|
| `/api/satellite/tiles/inventory` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Bulk lookup of (zoom, x, y) slippy-map coords (≤ 5000 entries / request); body shape per `tile-inventory.md` v1.0.0. Response order matches request order; each entry carries a `present` flag. |
| `/tiles/{z}/{x}/{y}` | GET | same as above | parent-suite enforces | Slippy-map tile fetch by coordinates (binary JPEG response). Issued only for inventory entries with `present=true`. |
C11 honours Retry-After on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream. Because the inventory response carries no Content-Length hint, AZ-308's pre-write budget check uses a conservative _DEFAULT_ESTIMATED_TILE_BYTES = 50 000 per-tile reserve.
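A rough sketch of the two size limits in this section: splitting inventory lookups into requests of at most 5000 entries, and reserving a fixed per-tile estimate before any write. The function names are hypothetical, and reading AC-8.3's "≤ 10 GB" as decimal gigabytes is an assumption.

```python
from typing import Iterator, List, Sequence, Tuple

TileCoord = Tuple[int, int, int]  # (zoom, x, y)

MAX_INVENTORY_ENTRIES = 5000           # per-request cap from the table above
DEFAULT_ESTIMATED_TILE_BYTES = 50_000  # mirrors _DEFAULT_ESTIMATED_TILE_BYTES
CACHE_BUDGET_BYTES = 10 * 10**9        # AC-8.3 "<= 10 GB"; decimal GB assumed


class CacheBudgetExceededError(RuntimeError):
    """Stand-in for the component's error of the same name."""


def chunk_coords(
    coords: Sequence[TileCoord], size: int = MAX_INVENTORY_ENTRIES
) -> Iterator[List[TileCoord]]:
    """Yield request-sized slices, preserving order."""
    for start in range(0, len(coords), size):
        yield list(coords[start:start + size])


def reserve_bytes(
    present_tile_count: int,
    bytes_in_cache: int,
    budget_bytes: int = CACHE_BUDGET_BYTES,
) -> int:
    """Return the bytes to reserve, or fail before anything is written."""
    needed = present_tile_count * DEFAULT_ESTIMATED_TILE_BYTES
    overshoot = bytes_in_cache + needed - budget_bytes
    if overshoot > 0:
        raise CacheBudgetExceededError(f"over budget by {overshoot} bytes")
    return needed
```

Under these assumptions an empty cache admits exactly 200 000 estimated tiles (200 000 × 50 000 B = 10 GB); one more tile trips the check.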
3.3 Upload — write path (D-PROJ-2 contract sketch, planned)
| Endpoint | Method | Auth | Rate Limit | Description |
|---|---|---|---|---|
| `/api/satellite/tiles/ingest` (parent-suite, planned) | POST | Per-flight onboard signing key (D-C8-9 = (d) family); each tile carries the signature | parent-suite enforces | Multipart upload of one or more tiles; response 202 with batch UUID + per-tile status. |
Request schema (multipart fields per tile):
- `tile_blob` (JPEG body, byte-identical to `satellite-provider`'s existing tile format)
- `zoomLevel` (int)
- `latitude` / `longitude` (double)
- `tile_size_meters` (double), `tile_size_pixels` (int)
- `capture_timestamp` (ISO 8601), `flight_id` (UUID), `companion_id` (string)
- `quality_metadata` (JSON; `TileQualityMetadata` per data_model.md)
- `signature` (per-flight onboard signing key signature over the payload)
Response: 202 Accepted + {batch_uuid: UUID, per_tile_status: [...]}.
Test substitute during NFT-SEC-01 / FT-P-17 / IT runs: the e2e-test mock-suite-sat-service fixture (under tests/fixtures/mock-suite-sat-service/) implements the planned POST surface so upload integration tests can run before D-PROJ-2 ships service-side. Download integration tests run against the real satellite-provider (its existing GET surface is already implemented). The mock is not a component and is never reached in production.
4. Data Access Patterns
C11 reads from / writes to C6 (the local store) and reads from / writes to satellite-provider (network). It owns no relational state of its own beyond a small download-progress journal and a small upload-progress journal.
Caching Strategy
| Data | Cache Type | TTL | Invalidation |
|---|---|---|---|
| Download-progress journal | filesystem alongside the operator workstation cache root | until a `download_tiles_for_area` run completes | per-area run on completion |
| Pending-upload journal | filesystem alongside the operator workstation cache root | until upload acknowledged | per-batch acknowledgment removes from journal |
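As an illustration of the "acknowledgment removes from journal" rule, here is a minimal sketch of an atomic journal rewrite. It assumes a JSON-array journal with a `tile_id` field per row, which is not specified in this document, and it swaps in the standard library's write-to-temp-then-`os.replace` pattern in place of the `atomicwrites` package listed under Key Dependencies.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, List, Set


def write_journal(path: Path, rows: List[Dict[str, str]]) -> None:
    """Replace the journal in one step so readers never see a partial file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(rows, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure bytes hit disk before the rename
        os.replace(tmp_name, path)     # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def acknowledge(path: Path, acked_tile_ids: Set[str]) -> List[Dict[str, str]]:
    """Drop acknowledged rows; unacknowledged rows stay pending."""
    rows = json.loads(path.read_text(encoding="utf-8"))
    remaining = [row for row in rows if row["tile_id"] not in acked_tile_ids]
    write_journal(path, remaining)
    return remaining
```

If the process dies mid-write, the previous journal file is still intact, so the next run simply retries the same pending rows.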
Storage Estimates
| Table/Collection | Est. Row Count (1yr) | Row Size | Total Size | Growth Rate |
|---|---|---|---|---|
| Download-progress journal | a few hundred rows per area provisioning | ~256 B / row | <1 MB | per provisioning run |
| Pending-upload journal | a few hundred per flight | ~256 B / row | <1 MB | per flight |
Data Management
Seed data: none — both journals are empty until the operator triggers a download or an upload run.
Rollback: the download path is idempotent — re-running download_tiles_for_area for an unchanged (bbox, zoom_levels, sector_class) triggers C10's manifest-hash check (D-C10-1) downstream and the engine/descriptor build is skipped. The upload path is idempotent on the service side via the (zoomLevel, lat, lon, capture_timestamp, companion_id, flight_id) dedup key.
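To make the upload-side idempotency concrete, the sketch below models the dedup key as a tuple and shows a re-sent tile coming back as `duplicate`. The in-memory `FakeIngest` class is purely illustrative; how the real service stores or compares keys (for example, float rounding of lat/lon) is not specified here.

```python
from typing import NamedTuple, Set


class DedupKey(NamedTuple):
    """Field order follows the dedup key listed in the text above."""
    zoom_level: int
    lat: float
    lon: float
    capture_timestamp: str
    companion_id: str
    flight_id: str


class FakeIngest:
    """Hypothetical in-memory model of service-side dedup; not the real service."""

    def __init__(self) -> None:
        self._seen: Set[DedupKey] = set()

    def ingest(self, key: DedupKey) -> str:
        if key in self._seen:
            return "duplicate"  # re-upload after a lost ack is harmless
        self._seen.add(key)
        return "queued"
```

Because a replayed batch only yields `duplicate` statuses, the client can retry an unacknowledged upload without tracking which tiles the service already accepted.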
5. Implementation Details
Algorithmic Complexity:
- Route seed: bounded by parent-suite tile materialisation latency (~seconds–minutes for the Derkachi corridor; gated by `poll_max_attempts × poll_interval_s`).
- Download: linear in tile count; bandwidth-bound by the operator workstation's link to `satellite-provider`.
- Upload: linear in pending tile count; bandwidth-bound; bursty post-landing.
State Management: stateless except for the two journals (download / pending-upload). The route client is fully stateless — each seed_route call submits, polls, verifies, and returns.
Key Dependencies:
| Library | Version | Purpose |
|---|---|---|
| httpx | per project pin | POST inventory + GET slippy-map (download), POST route + GET status (route seed), multipart POST (upload) to satellite-provider |
| atomicwrites | latest | Journal updates |
| cryptography | per project pin | Per-flight signing key (upload payload signing); the production satellite-provider ingest endpoint and the e2e-test mock-suite-sat-service fixture both verify with the same key family |
Error Handling Strategy:
- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on download / upload. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. Do not delete uploaded tiles from C6 until acknowledged.
- `RateLimitedError` (429): obey `Retry-After`; the operator can also re-invoke later. Same handling either direction.
- `FreshnessRejectionError` / `ResolutionRejectionError`: download-side only. Per AC-NEW-6 / RESTRICT-SAT-4, never silently downgrade fresh-required tiles in `active_conflict` sectors. Surface counts in the `DownloadBatchReport`.
- `CacheBudgetExceededError`: download-side only. Pre-flight free-space check against AC-8.3 (≤ 10 GB). Fail fast with explicit budget delta; no partial write.
- `SignatureRejectedError`: upload-side only. Per-flight signing key was rejected by `satellite-provider`. This is a security-critical event: do NOT silently drop; surface to operator + log to FDR.
- Route-seed errors (cycle 3, dedicated hierarchy under `SatelliteProviderRouteError`): `RouteValidationError` (4xx + RFC 7807 `errors` dict; raised pre-emptively for AZ-809 validator violations BEFORE the HTTP POST), `RouteTransientError` (5xx / network / timeout; carries `__cause__`), `RouteTerminalFailureError` (parent suite reports a non-success terminal status; `.detail` carries the response JSON). Separate hierarchy from `TileManagerError` because the route flow is corridor onboarding, not per-tile transfer.
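The retry rules above (obey `Retry-After` on 429, back off on 5xx, fail fast otherwise) can be condensed into one decision function. This is a sketch under assumptions: the function name, the exponential schedule, and the `base_s` / `cap_s` values are illustrative, and only the delta-seconds form of `Retry-After` is handled (the header may also carry an HTTP date).

```python
from typing import Optional


def retry_delay_s(
    status_code: int,
    attempt: int,
    retry_after: Optional[str] = None,
    base_s: float = 1.0,
    cap_s: float = 60.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None to fail fast.

    `attempt` is 0-based; the backoff parameters are illustrative assumptions.
    """
    backoff = min(cap_s, base_s * (2 ** attempt))
    if status_code == 429:
        # Prefer the server's hint when it is a plain number of seconds.
        if retry_after is not None and retry_after.strip().isdigit():
            return float(retry_after.strip())
        return backoff
    if 500 <= status_code < 600:
        return backoff
    return None  # 4xx such as 401 / 403: retrying will not help
```

Keeping the decision pure (status in, delay out) lets the same rule serve the download and upload directions, matching the "same handling either direction" note for 429s.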
Post-landing safety: C11's upload path no longer gates on flight state internally. The check now lives in C12's PostLandingUploadOrchestrator (AZ-329 / Batch 44), which refuses to invoke TileUploader.upload_pending_tiles unless the C13 flight_footer FDR record records clean_shutdown=True for the target flight. ADR-004 process-level isolation remains the primary control — C11 should never run on the companion at all.
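A minimal sketch of that C12-side gate, assuming the `flight_footer` record is available as a dict (or `None` when it is missing). The function name and record shape are hypothetical; only the `clean_shutdown=True` condition and the `FlightStateNotConfirmedError` name come from this document.

```python
from typing import Any, Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


class FlightStateNotConfirmedError(RuntimeError):
    """Stand-in for the C12-boundary refusal named in the text."""


def upload_if_clean_shutdown(
    flight_footer: Optional[Mapping[str, Any]],
    upload: Callable[[], T],
) -> T:
    """Invoke `upload` only when the footer explicitly records a clean shutdown."""
    if flight_footer is None or flight_footer.get("clean_shutdown") is not True:
        # Missing footer, missing field, and False are all treated as "not confirmed".
        raise FlightStateNotConfirmedError("flight_footer does not confirm clean shutdown")
    return upload()
```

The check is strict on purpose: a truthy-but-not-`True` value or an absent record refuses the upload, so the uploader is never reached on ambiguous evidence.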
6. Extensions and Helpers
| Helper | Purpose | Used By |
|---|---|---|
| `TileSignaturePayloadBuilder` | constructs the signed payload for D-PROJ-2 contract (upload) | C11 only; keep inside the component |
7. Caveats & Edge Cases
Known limitations:
- D-PROJ-2 ingest endpoint is NOT yet implemented service-side. Until parent-suite delivers the endpoint, C11 will fail every upload and the pending-upload journal accumulates. Operator workflow tolerates this.
- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST upload contract (per the leftover file). Download + route-seed integration tests run against the real `satellite-provider` on the Jetson harness. Production runs reach `satellite-provider` directly in all three directions; the fixture is never on the production path.
- `TileDownloader` and `SatelliteProviderRouteClient` require the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing.
- Imagery source license attribution (cycle 3, AZ-777 Phase 2): the Jetson `satellite-provider` instance downloads from the Google Maps satellite layer (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; the operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution string. Production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the satellite-provider side (parent-suite ticket TBD; surfaced in `_docs/00_problem/input_data/flight_derkachi/README.md`).
- Dev TLS cert: the e2e-runner today accepts the self-signed dev cert via `SATELLITE_PROVIDER_TLS_INSECURE=1`. Production deploys must validate against a CA-issued cert (`SATELLITE_PROVIDER_TLS_INSECURE=0`); the env knob is documented in `.env.test.example` + the smoke test + this section as development-only.
Potential race conditions:
- If the operator launches two `TileDownloader` runs concurrently against the same cache root, a filesystem lockfile (operated by C12 tooling) prevents corrupting C6's tile rows. The same lockfile gates concurrent `TileUploader` invocations.
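One way such a lockfile could work is exclusive file creation, sketched below. The actual C12 mechanism is not described in this document, so the lock name `.c11.lock`, the `CacheRootLockedError` type, and the absence of stale-lock recovery are all assumptions of this illustration.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


class CacheRootLockedError(RuntimeError):
    """Hypothetical error: another run already holds the cache-root lock."""


@contextmanager
def cache_root_lock(cache_root: Path, name: str = ".c11.lock") -> Iterator[Path]:
    """Hold an exclusive lockfile under `cache_root` for the duration of a run."""
    lock_path = cache_root / name
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise CacheRootLockedError(f"lock already held: {lock_path}") from exc
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))  # owner PID, for diagnosis
        yield lock_path
    finally:
        os.close(fd)
        lock_path.unlink()
```

A second downloader or uploader entering `cache_root_lock` on the same root fails immediately instead of interleaving writes. A crashed run would leave the file behind, so real tooling would also need a stale-lock policy.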
Performance bottlenecks:
- Route seed: parent-suite tile-materialisation latency dominates (corridor onboarding from Google Maps upstream). Bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min wall-clock ceiling).
- Download: bandwidth-bound by the operator workstation's `satellite-provider` link; descriptor / engine work is downstream in C10 (offline, minutes).
- Upload: bandwidth-bound. Per-flight upload volume is bounded by the F4 mid-flight tile gen cap (typically a few hundred tiles, each 50–200 KB → tens of MB per flight).
8. Dependency Graph
Must be implemented after: C6 (read source for upload, write target for download), satellite-provider (download + route-seed paths; existing) + D-PROJ-2 endpoint (upload path; the e2e-test mock-suite-sat-service fixture covers tests until the real endpoint ships). replay_input.tlog_route (AZ-836) is a soft prerequisite for the route-seed path — the route client accepts any RouteSpec regardless of how it was produced, but the cycle-3 e2e fixture wires extract_route_from_tlog upstream.
Can be implemented in parallel with: anything except C6 changes.
Blocks: F1 (pre-flight cache build cannot start without TileDownloader or — for the route-driven variant — SatelliteProviderRouteClient.seed_route), F10 (post-landing upload cannot start without TileUploader).
9. Logging Strategy
| Log Level | When | Example |
|---|---|---|
| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError`, `RouteTerminalFailureError` | `C11 upload failure: signature rejected by satellite-provider`; `c11.route.poll.terminal kind=failed route_id=…` |
| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts), `RouteTransientError` retries, `RouteValidationError` pre-flight rejections | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30`; `c11.route.validation_failed field=points reason=below_min(2)` |
| INFO | session start/end; per-batch report (download + upload); route submit + each poll tick + inventory verify | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…`; `c11.route.submit route_id=…`; `c11.route.poll.tick attempt=3 status=processing` |
| DEBUG | per-tile request/response; per-tile inventory entries | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
Cycle-3 route-client log kinds: c11.route.submit, c11.route.poll.tick, c11.route.poll.terminal, c11.route.inventory, c11.route.validation_failed (component c11_tile_manager.route_client).
Log format: structured JSON.
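A structured-JSON line for one of the route-client log kinds might be produced as in the sketch below. The JSON field names (`level`, `component`, `kind`) and the `fields` extra are assumptions of this example; the document fixes only the log kinds and the component name.

```python
import io
import json
import logging
from typing import TextIO


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line (field names assumed)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "component": record.name,
            "kind": record.getMessage(),
        }
        # Structured key/value pairs passed via `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_json_logger(component: str, stream: TextIO) -> logging.Logger:
    """Attach a JSON-line handler writing to `stream` at INFO level."""
    logger = logging.getLogger(component)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_json_logger("c11_tile_manager.route_client", buffer)
    log.info("c11.route.poll.tick", extra={"fields": {"attempt": 3, "status": "processing"}})
    print(buffer.getvalue(), end="")
```

Putting the log kind in its own field keeps lines greppable by kind while the per-event details stay machine-readable.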
Log storage: operator workstation log file (e.g. ~/.azaion/onboard/c11-tilemanager.log); also writes per-run summaries (download report, upload report) to the operator workstation cache root for audit. The companion's FDR is NOT involved (C11 doesn't run on the companion).