# Contract: tile-storage

**Component**: DataAccess
**Producer task**: AZ-484 (v1.0.0 — multi-source schema) + AZ-503-foundation (v2.0.0 — tile identity columns) + AZ-505 (v2.0.0 freeze — covering index + location_hash-keyed reads + bulk inventory)
**Consumer tasks**: AZ-485 (UAV upload endpoint, cycle 5 — `uav-tile-upload.md` v1.1.0), AZ-505 (inventory endpoint, this cycle — `tile-inventory.md` v1.0.0), future SatAR / additional-source tasks
**Version**: 2.0.0
**Status**: frozen
**Last Updated**: 2026-05-12

## Purpose

Defines how satellite imagery tiles are persisted in the `tiles` table when more than one acquisition source (and multiple UAV flights per source) can write to the same geographic cell, AND how the table is indexed for the two distinct read patterns it serves:

1. **Producer writes** (`POST /api/satellite/upload`, `GoogleMapsDownloaderV2`) — per-source, per-flight UPSERTs keyed by integer slippy coords.
2. **Consumer reads**:
   - Leaflet hot path — `GET /tiles/{z}/{x}/{y}` returns the most-recent variant by `location_hash`.
   - Bulk inventory — `POST /api/satellite/tiles/inventory` returns one row per `location_hash` across many cells in one round trip.

Producers must agree on the source enum, `captured_at` semantics, `flight_id` semantics, and the per-(source × flight) UPSERT contract. Readers must use the `location_hash`-keyed selection rule and tolerate the multi-source / multi-flight row layout.
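To make the `location_hash` key that both read paths rely on concrete, here is a minimal Python sketch of the derivation this contract states in its Field reference: `UUIDv5(TileNamespace, "{tile_zoom}/{tile_x}/{tile_y}")`. The namespace value is the one quoted in this contract. The function name `location_hash` is hypothetical and not part of any repository API, and the sketch assumes the C# `Uuidv5.Create` helper hashes the same UTF-8 string that Python's `uuid.uuid5` does.

```python
import uuid

# Namespace quoted in this contract (cross-repo invariant; must not change).
TILE_NAMESPACE = uuid.UUID("5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c")


def location_hash(tile_zoom: int, tile_x: int, tile_y: int) -> uuid.UUID:
    """Hypothetical helper: deterministic cell identifier for slippy coords."""
    # The name string follows the "{tile_zoom}/{tile_x}/{tile_y}" format.
    name = f"{tile_zoom}/{tile_x}/{tile_y}"
    return uuid.uuid5(TILE_NAMESPACE, name)


if __name__ == "__main__":
    # Coordinates taken from the location-hash-stability test case.
    h = location_hash(18, 154321, 95812)
    print(h, "version", h.version)
```

Because the hash depends only on `(tile_zoom, tile_x, tile_y)`, every source and flight writing to the same cell lands on the same `location_hash`, which is what lets the readers select across sources with a single equality lookup.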
## Shape

### Schema (PostgreSQL `tiles` table — relevant columns only)

```sql
-- Pre-existing columns (unchanged since AZ-484)
id                UUID PRIMARY KEY
tile_zoom         INT NOT NULL
tile_x            INT NOT NULL
tile_y            INT NOT NULL
latitude          DOUBLE PRECISION NOT NULL
longitude         DOUBLE PRECISION NOT NULL
tile_size_meters  DOUBLE PRECISION NOT NULL
tile_size_pixels  INT NOT NULL
image_type        VARCHAR(10) NOT NULL
file_path         VARCHAR(500) NOT NULL
created_at        TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
updated_at        TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP

-- AZ-484 (v1.0.0)
source            VARCHAR(32) NOT NULL  -- enum-stored: 'google_maps' | 'uav'
captured_at       TIMESTAMP NOT NULL    -- UTC; producer-supplied semantics, see below

-- AZ-503-foundation (this contract — v2.0.0)
flight_id         UUID NULL             -- per-UAV-flight identifier; NULL for google_maps + legacy uav
location_hash     UUID NOT NULL         -- UUIDv5(TileNamespace, "{tile_zoom}/{tile_x}/{tile_y}")
content_sha256    BYTEA NULL            -- SHA-256 of JPEG body at insert time; NULL for legacy rows only
legacy_id         UUID NULL             -- pre-AZ-503 random `id` preserved for one deprecation cycle

-- Vestigial columns (preserved per coderule.mdc; readers MUST NOT depend on them)
maps_version      VARCHAR(50) NULL
version           INT NULL
```

### Field reference (v2.0.0)

| Field | Type | Required | Description | Constraints |
|-------|------|----------|-------------|-------------|
| `source` | enum (`TileSource`) stored as `VARCHAR(32)` | yes | Producer of the tile | `'google_maps'` or `'uav'`. New values require a contract version bump. |
| `captured_at` | `TIMESTAMP` UTC | yes | Producer-defined "moment the imagery represents" | For `google_maps`: `DateTime.UtcNow` at download time (provider does not expose original imagery date). For `uav`: the UAV capture timestamp supplied by the upload client. Must be UTC; non-UTC must be converted before write. |
| `flight_id` | `UUID` | no (NULL for `google_maps` + legacy `uav`) | Per-flight identifier supplied by the UAV upload endpoint | When source = `'uav'` AND tile is AZ-503+ era → NOT NULL. When source = `'google_maps'` → MUST be NULL. Pre-AZ-503 `uav` rows may have NULL. UPSERT collapses NULL via `COALESCE(flight_id, '00000000-…'::uuid)`. |
| `location_hash` | `UUID` (v5) | yes | Deterministic cell identifier | `UUIDv5(Uuidv5.TileNamespace, "{tile_zoom}/{tile_x}/{tile_y}")`. Cross-repo invariant — `TileNamespace = 5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c`. Identical byte-for-byte with `gps-denied-onboard/components/c6_tile_cache/_uuid.py:TILE_NAMESPACE`. |
| `content_sha256` | `BYTEA` (32) | yes for AZ-503+ writes (NULL only for pre-AZ-503 rows the migration could not re-hash) | SHA-256 of the JPEG body at insert time | Application invariant: enforced NOT NULL on new writes via `TileEntity.ContentSha256`. Migration 014 left the column nullable because it could not safely re-open tile files on disk during schema migration. |
| `legacy_id` | `UUID` | no | Pre-AZ-503 random `id` preserved for one deprecation cycle | NULL for AZ-503+ rows; populated for rows that pre-date migration 014. Will be dropped in a follow-up migration once external references are confirmed flushed. |
| `(tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-…'::uuid))` | composite | yes | Per-source-per-flight uniqueness | Enforced via `UNIQUE INDEX idx_tiles_unique_identity`. Replaces the AZ-484 lat/lon-keyed uniqueness from `idx_tiles_unique_location_source`. |

### Indexes (v2.0.0)

```sql
-- Per-source-per-flight uniqueness. NULL-safe via COALESCE.
CREATE UNIQUE INDEX idx_tiles_unique_identity
    ON tiles (
        tile_zoom,
        tile_x,
        tile_y,
        tile_size_meters,
        source,
        COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid)
    );

-- Leaflet hot-path covering index. AZ-505.
CREATE INDEX tiles_leaflet_path
    ON tiles (location_hash, captured_at DESC, updated_at DESC, id DESC)
    INCLUDE (file_path, source);
```

Indexes dropped in v2.0.0:

- `idx_tiles_unique_location_source` (AZ-484 lat/lon-keyed uniqueness) — dropped by migration 014; superseded by `idx_tiles_unique_identity`.
- `idx_tiles_unique_location` (pre-AZ-484 4-column uniqueness) — dropped by migration 013; included here for completeness.
- `idx_tiles_location_hash` (lightweight lookup added by migration 014) — dropped by migration 015; superseded — equality lookups by `location_hash` use the leading column of `tiles_leaflet_path`.

### Producer write API

| Operation | Repository method | Conflict semantics |
|-----------|-------------------|--------------------|
| Insert / replace same-(source, flight) row for a cell | `ITileRepository.InsertAsync(TileEntity)` | `ON CONFLICT (tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-…'::uuid)) DO UPDATE SET file_path, latitude, longitude, captured_at, updated_at, content_sha256`. Producers MUST set `Source`, `CapturedAt`, `LocationHash`, `ContentSha256`. Producers MUST set `FlightId` when source = `uav`. |
| Update by primary key | `ITileRepository.UpdateAsync(TileEntity)` | Updates by `id` only. Caller's responsibility not to violate the unique index. |
| Delete by primary key | `ITileRepository.DeleteAsync(Guid)` | Removes a single row by `id`; no cascade. |

### Consumer read API and selection rule

| Operation | Repository method | Selection rule |
|-----------|-------------------|----------------|
| Read by `id` | `ITileRepository.GetByIdAsync(Guid)` | Returns the row identified by `id` (no source/flight filter). |
| Read most-recent for a cell by slippy coordinates | `ITileRepository.GetByTileCoordinatesAsync(zoom, x, y)` | Computes `location_hash = UUIDv5(TileNamespace, "{zoom}/{x}/{y}")` and returns the row with the highest `(captured_at, updated_at, id)` tuple for that hash across all sources/flights. At most one row. |
| Read region | `ITileRepository.GetTilesByRegionAsync(lat, lon, sizeMeters, zoomLevel)` | Returns at most one row per `(tile_zoom, tile_x, tile_y, tile_size_meters)` group, selected by the same most-recent rule. |
| Bulk inventory lookup | `ITileRepository.GetTilesByLocationHashesAsync(IReadOnlyList<Guid> locationHashes)` | Returns at most one row per requested `location_hash`, selected by `DISTINCT ON (location_hash) ... ORDER BY location_hash, captured_at DESC, updated_at DESC, id DESC`. Used by the AZ-505 inventory endpoint. |

The selection rule is **most-recent across all sources and flights** ordered by `captured_at DESC`, with `(updated_at DESC, id DESC)` as deterministic tie-breakers. No voting / trust-promotion filter is applied at this layer.

## Invariants

- **Inv-1**: Every row has a non-null `source` whose string value is a member of `TileSource`. Rows with unknown source values are a contract violation.
- **Inv-2**: Every row has a non-null `captured_at` in UTC.
- **Inv-3**: At most one row exists per `(tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-…'::uuid))`. NULL-coalesced flight_id is what makes `google_maps` rows (where flight_id is always NULL) deduplicate to one row per cell-and-size-and-source.
- **Inv-4**: For any cell with one or more rows, the row returned by `GetByTileCoordinatesAsync` and the per-cell row returned by `GetTilesByRegionAsync` are identical.
- **Inv-5**: The `source` column value space is closed: only the snake_case wire values defined in `SatelliteProvider.Common.Enums.TileSourceConverter` (`"google_maps"`, `"uav"`) are valid. Adding a new producer requires a new `TileSource` enum member, a corresponding wire value in `TileSourceConverter`, AND a contract version bump (minor). Note: `TileEntity.Source` is stored as the wire string (not the C# enum) because Dapper's `TypeHandler` for enum types is bypassed during read deserialization (Dapper issue #259); `TileSourceConverter.{ToWireValue,FromWireValue}` is the documented bridge.
- **Inv-6**: `captured_at` semantics are producer-defined per the Field Reference table above; consumers MUST NOT reinterpret it (e.g., consumers MUST NOT assume `captured_at` from `google_maps` reflects original imagery date).
- **Inv-7** (new in v2.0.0): `location_hash` is functionally determined by `(tile_zoom, tile_x, tile_y)` — every row with the same slippy coords has the same hash, and that hash equals `UUIDv5(Uuidv5.TileNamespace, "{tile_zoom}/{tile_x}/{tile_y}")`. The namespace constant is a cross-repository invariant that must NOT be changed unilaterally — see `coderule.mdc` "cross-repo invariants" and the AZ-503 migration header.
- **Inv-8** (new in v2.0.0): When `source = 'google_maps'`, `flight_id` MUST be NULL. When `source = 'uav'` AND the row is AZ-503+ era (created after migration 014), `flight_id` SHOULD be non-NULL; legacy `uav` rows with NULL `flight_id` are tolerated for one deprecation cycle.
- **Inv-9** (new in v2.0.0): `GetByTileCoordinatesAsync` filters by `location_hash`, not by `(tile_zoom, tile_x, tile_y)` directly. Callers that pass the same `(z, x, y)` tuple get byte-identical results to v1.0.0 because the hash is deterministic; this is a behavior-preserving rewrite that exists to make the leaflet hot path index-only against `tiles_leaflet_path`.

## Non-Goals

- **Not covered**: Per-source / per-flight historical revision retention. Same-(source, flight) uploads to the same cell overwrite the previous row by design — this is not a versioned table. Consumers wanting season selection or rollback must propose a v3 schema.
- **Not covered**: Cross-source / cross-flight merging or compositing at read time. Reads return exactly one row per cell.
- **Not covered**: Quality scoring, threshold gating, or voting / trust-promotion at this layer. Voting is owned by `gps-denied-onboard` Design Task #2 and consumes `flight_id` from this contract.
- **Not covered**: Backwards-compatible reads against the v1.0.0 unique index. Migration 014 is mandatory before any consumer of v2.0.0 runs.
- **Not covered**: The vestigial `maps_version` and `version` columns. Consumers MUST NOT read them; producers MUST NOT write them in v1.0.0+.
- **Not covered**: `content_sha256` integrity verification on read. The column is populated for new writes; downstream verification is a future-task concern.

## Versioning Rules

- **Patch (2.0.x)**: Documentation clarifications, additional invariants that do not change runtime behavior, expanded test cases.
- **Minor (2.x.0)**: Adding a new `TileSource` enum member; adding optional columns that consumers may safely ignore; adding new repository read methods; widening the `tiles_leaflet_path` INCLUDE list to remove heap fetches from inventory.
- **Major (3.0.0)**: Removing or renaming a column; changing the unique index columns; changing the selection rule (e.g., adding source priority or voting filter); changing `captured_at` from required to optional or vice versa; introducing per-(source, flight) historical revisions; changing the `Uuidv5.TileNamespace` constant (would also break sibling repos and require coordinated cross-repo work).

Each version bump requires updating the Change Log below and notifying every consumer listed in the header. If consumers' tasks have not yet been written, the producer task is responsible for surfacing the change to the user before merging.
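As an aid to reading the test cases, the per-(source, flight) uniqueness key (Inv-3) and the most-recent selection rule can be modelled in memory. The following Python sketch is an illustration under stated assumptions, not the repository implementation: the `Tile` dataclass, `identity_key`, `upsert`, and `most_recent` are hypothetical names, a dict stands in for the `tiles` table, and `latitude`, `longitude`, and `content_sha256` are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple
import uuid

# Stand-in for COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid).
NIL_UUID = uuid.UUID(int=0)


@dataclass
class Tile:
    """Hypothetical subset of the `tiles` columns needed for this sketch."""
    id: uuid.UUID
    tile_zoom: int
    tile_x: int
    tile_y: int
    tile_size_meters: float
    source: str                     # 'google_maps' or 'uav'
    flight_id: Optional[uuid.UUID]  # None models SQL NULL
    location_hash: uuid.UUID
    captured_at: datetime
    updated_at: datetime
    file_path: str


IdentityKey = Tuple[int, int, int, float, str, uuid.UUID]


def identity_key(t: Tile) -> IdentityKey:
    """Mirror the column list of idx_tiles_unique_identity (Inv-3)."""
    flight = t.flight_id if t.flight_id is not None else NIL_UUID
    return (t.tile_zoom, t.tile_x, t.tile_y, t.tile_size_meters, t.source, flight)


def upsert(table: Dict[IdentityKey, Tile], t: Tile) -> None:
    """Insert, or overwrite the same-(source, flight) row for the cell."""
    existing = table.get(identity_key(t))
    if existing is None:
        table[identity_key(t)] = t
        return
    # Conflict path: the existing row keeps its id; listed fields are replaced.
    existing.file_path = t.file_path
    existing.captured_at = t.captured_at
    existing.updated_at = t.updated_at


def most_recent(table: Dict[IdentityKey, Tile],
                location_hash: uuid.UUID) -> Optional[Tile]:
    """Highest (captured_at, updated_at, id) across all sources and flights."""
    candidates = [t for t in table.values() if t.location_hash == location_hash]
    return max(candidates,
               key=lambda t: (t.captured_at, t.updated_at, t.id),
               default=None)


if __name__ == "__main__":
    cell = uuid.uuid4()  # placeholder hash; real rows use the UUIDv5 derivation
    t1 = datetime(2026, 5, 1, 10, 0, tzinfo=timezone.utc)
    t2 = datetime(2026, 5, 1, 11, 0, tzinfo=timezone.utc)
    table: Dict[IdentityKey, Tile] = {}
    upsert(table, Tile(uuid.uuid4(), 18, 1, 2, 100.0, "google_maps", None,
                       cell, t1, t1, "google.jpg"))
    upsert(table, Tile(uuid.uuid4(), 18, 1, 2, 100.0, "uav", uuid.uuid4(),
                       cell, t2, t2, "uav.jpg"))
    print(len(table), most_recent(table, cell).source)
```

Taking the maximum of the `(captured_at, updated_at, id)` tuple is equivalent to `ORDER BY captured_at DESC, updated_at DESC, id DESC LIMIT 1`, so the tie-breakers make the result deterministic even when two rows share a timestamp.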
## Test Cases

| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-google-only | Insert `source='google_maps' captured_at=T1 flight_id=NULL` for a fresh cell | Single row returned by region read; `source='google_maps'`, `captured_at=T1`. | v1.0.0 baseline regression case. |
| valid-multi-source | Insert `google_maps captured_at=T1`, then `uav captured_at=T2 > T1 flight_id=F1` for same cell | Both rows persisted; `GetByTileCoordinatesAsync` returns the `uav` row. | AC-1 + AC-2 of AZ-484. |
| valid-multi-flight | Insert two `uav` rows with distinct `flight_id`s for same `(z, x, y)`, `captured_at=T1` and `T2 > T1` | Both rows persisted under `idx_tiles_unique_identity`; most-recent rule returns the `T2` row. | v2.0.0 new — was a unique-index violation under v1.0.0. |
| same-source-same-flight-upsert | Insert `uav captured_at=T1 flight_id=F1`, then `uav captured_at=T2 > T1 flight_id=F1` for same cell | Exactly one `uav/F1` row remains, with `captured_at=T2` and updated `file_path`. | AZ-484 AC-3, preserved through AZ-503 schema rewrite. |
| time-tiebreak | Insert `google_maps captured_at=T`, then `uav captured_at=T flight_id=F1` (identical timestamps) for same cell | Selection deterministic by `(updated_at DESC, id DESC)` tie-break; result must be reproducible across two test runs with the same seed. | Inv-4 enforcement. |
| location-hash-stability | Compute UUIDv5 for `(z=18, x=154321, y=95812)` both in C# (`Uuidv5.Create`) and in Postgres (migration 014 helper) | Identical 16 bytes. Both equal `gps-denied-onboard`'s Python `uuid5(TILE_NAMESPACE, "18/154321/95812")`. | Inv-7 cross-repo invariant. |
| leaflet-index-only | Seed ≥ 100k rows, `VACUUM ANALYZE tiles`, then `EXPLAIN (ANALYZE) SELECT file_path FROM tiles WHERE location_hash = $1 ORDER BY captured_at DESC, updated_at DESC, id DESC LIMIT 1` (the `ANALYZE` option is required because plain `EXPLAIN` does not report `Heap Fetches`) | Plan contains `Index Only Scan using tiles_leaflet_path`; `Heap Fetches` ≤ 1. | AZ-505 AC-3. |
| bulk-inventory-ordering | `GetTilesByLocationHashesAsync` with 2500 hashes (mix of present + absent) | Result is one-row-per-distinct-hash, most-recent across (source, flight). Order is hash-keyed; caller re-aligns to request order. | AZ-505 AC-1 / AC-4. |
| backfill-completeness | Migration 013 against a snapshot DB with N pre-existing rows | Post-migration row count is N; every row has `source='google_maps'` and `captured_at = created_at`. | AZ-484 AC-4. |
| location-hash-backfill | Migration 014 against a snapshot DB after AZ-484 has applied | Every row has non-NULL `location_hash` matching the application-side UUIDv5 for that row's `(tile_zoom, tile_x, tile_y)`. | AZ-503-foundation guarantee. |
| invalid-source | Direct SQL insert with `source='satar'` (not in enum) | Repository read either rejects deserialization or raises a contract violation; behavior MUST surface the violation, not swallow it. | Inv-1 + `coderule.mdc` "never suppress errors silently". |

## Change Log

| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-11 | Initial contract — multi-source schema (`source`, `captured_at`), 5-column unique key, most-recent-across-sources read rule. Produced by AZ-484. | autodev (Step 9) |
| 2.0.0 | 2026-05-12 | **MAJOR**. Identity columns + covering-index freeze. Added columns: `flight_id` (per-UAV-flight, nullable), `location_hash` (UUIDv5, NOT NULL), `content_sha256` (BYTEA, app-NOT-NULL), `legacy_id` (pre-AZ-503 random id preserved one cycle). Replaced AZ-484 `idx_tiles_unique_location_source` (lat/lon-keyed) with `idx_tiles_unique_identity` (integer slippy + per-flight, NULL-coalesced) — migration 014. Added covering index `tiles_leaflet_path (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source)` — migration 015. Rewrote `GetByTileCoordinatesAsync` to filter on `location_hash` (behavior-preserving — same UUIDv5 deterministic on both ends — to enable index-only scan on the leaflet hot path). Added `GetTilesByLocationHashesAsync` for the AZ-505 bulk inventory endpoint. Introduced Inv-7 / Inv-8 / Inv-9. Produced jointly by AZ-503-foundation (cycle 5, columns + identity index) and AZ-505 (cycle 6, covering index + location_hash-keyed reads + bulk inventory). Consumers reviewed at bump time: AZ-485 (`uav-tile-upload.md` v1.1.0) — already aligned in cycle 5; AZ-505 (`tile-inventory.md` v1.0.0) — produced jointly. | autodev (Step 10, cycle 6) |