satellite-provider/_docs/02_document/contracts/data-access/tile-storage.md
Oleksandr Bezdieniezhnykh e9d6db077c [AZ-484] Fix multi-source tile reads: drop Dapper enum handler
Two integration-test failures uncovered after the initial commit:

1) GetTilesByRegionAsync outer ORDER BY referenced 'updated_at' but
   the inner DISTINCT ON subquery aliased it to 'UpdatedAt' (Postgres
   folds to 'updatedat'). DISTINCT ON already guarantees one row per
   (latitude, longitude, ...) so the third tiebreak was unreachable;
   removed it.

2) Dapper 2.1.35 silently bypasses SqlMapper.TypeHandler<T> for enum
   types during read deserialization (Dapper issue #259). The
   TileSourceTypeHandler worked for writes but reads fell through to
   Enum.TryParse, which cannot map 'google_maps' to GoogleMaps.

   Pivoted: TileEntity.Source is now a string (the wire value).
   TileSource enum stays as the public producer surface in
   Common.Enums; TileSourceConverter (Common.Enums) provides
   ToWireValue / FromWireValue / IsValidWireValue at the boundary.
   TileSourceTypeHandler deleted; registration removed from
   DapperEnumTypeHandlers.RegisterAll.

   tile-storage.md Inv-5 amended to document the storage choice.
   _docs/LESSONS.md L-001 records the Dapper bypass for future cycles.

Full suite passes (213 unit + integration suite incl. AZ-484
AC-1..AC-5, security SEC-01..SEC-04, AZ-356/362/357).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 06:44:34 +03:00

Contract: tile-storage

Component: DataAccess
Producer task: AZ-484 (_docs/02_tasks/todo/AZ-484_multi_source_tile_storage.md)
Consumer tasks: AZ-485 (planned T2, UAV upload endpoint); future tasks adding additional sources (e.g., SatAR)
Version: 1.0.0
Status: frozen
Last Updated: 2026-05-11

Purpose

Defines how satellite imagery tiles are persisted in the tiles table when more than one acquisition source can write to the same geographic cell. Producers must agree on the source enum, the captured-at semantics, and the per-source UPSERT contract. Readers must use the documented selection rule and tolerate the multi-source row layout.

Shape

Schema (PostgreSQL tiles table — relevant columns only)

-- Pre-existing columns (unchanged)
id                  UUID            PRIMARY KEY
tile_zoom           INT             NOT NULL
tile_x              INT             NOT NULL
tile_y              INT             NOT NULL
latitude            DOUBLE PRECISION NOT NULL
longitude           DOUBLE PRECISION NOT NULL
tile_size_meters    DOUBLE PRECISION NOT NULL
tile_size_pixels    INT             NOT NULL
image_type          VARCHAR(10)     NOT NULL
file_path           VARCHAR(500)    NOT NULL
created_at          TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP
updated_at          TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP

-- New in v1.0.0 (this contract)
source              VARCHAR(32)     NOT NULL    -- enum-stored: 'google_maps' | 'uav'
captured_at         TIMESTAMP       NOT NULL    -- UTC; producer-supplied semantics, see below

-- Vestigial columns (preserved per coderule.mdc; readers MUST NOT depend on them)
maps_version        VARCHAR(50)     NULL
version             INT             NULL

Field reference

| Field | Type | Required | Description | Constraints |
|---|---|---|---|---|
| source | enum (TileSource) stored as VARCHAR(32) | yes | Producer of the tile | 'google_maps' or 'uav'. New values require a contract version bump. |
| captured_at | TIMESTAMP UTC | yes | Producer-defined "moment the imagery represents" | For google_maps: DateTime.UtcNow at download time (provider does not expose original imagery date). For uav: the UAV capture timestamp supplied by the upload client. Must be UTC; non-UTC must be converted before write. |
| (latitude, longitude, tile_zoom, tile_size_meters, source) | composite | yes | Per-source uniqueness | Enforced via UNIQUE INDEX idx_tiles_unique_location_source. |
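
To illustrate the UTC requirement on captured_at, a timestamp that carries an offset can be normalised in PostgreSQL as sketched below. This is only an illustration: the contract expects producers to convert before the write, and the literal value is invented for the example.

```sql
-- Hypothetical UAV capture time reported with a +03:00 offset.
-- timestamptz AT TIME ZONE 'UTC' yields a TIMESTAMP (no zone) expressed in UTC,
-- which matches the captured_at column type.
SELECT TIMESTAMPTZ '2026-05-11 09:30:00+03' AT TIME ZONE 'UTC' AS captured_at_utc;
-- captured_at_utc: 2026-05-11 06:30:00
```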

Index

CREATE UNIQUE INDEX idx_tiles_unique_location_source
    ON tiles (latitude, longitude, tile_zoom, tile_size_meters, source);

The previous 4-column unique index (latitude, longitude, tile_zoom, tile_size_meters) from migration 012 is dropped.
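
A rough sketch of what the migration to this schema might look like follows. The backfill values come from the backfill-completeness test case in this contract; the name of the old index (idx_tiles_unique_location) is a placeholder assumption, since the contract does not state it, and the real migration 013 may be structured differently.

```sql
BEGIN;

-- Add the new columns as nullable first so existing rows can be backfilled.
ALTER TABLE tiles ADD COLUMN source      VARCHAR(32);
ALTER TABLE tiles ADD COLUMN captured_at TIMESTAMP;

-- Backfill: every pre-existing row is treated as a google_maps tile
-- captured at its creation time.
UPDATE tiles
SET source      = 'google_maps',
    captured_at = created_at;

-- Enforce the NOT NULL contract once all rows are populated.
ALTER TABLE tiles ALTER COLUMN source      SET NOT NULL;
ALTER TABLE tiles ALTER COLUMN captured_at SET NOT NULL;

-- Swap the 4-column unique index for the 5-column one.
-- ASSUMPTION: the old index name is hypothetical.
DROP INDEX IF EXISTS idx_tiles_unique_location;
CREATE UNIQUE INDEX idx_tiles_unique_location_source
    ON tiles (latitude, longitude, tile_zoom, tile_size_meters, source);

COMMIT;
```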

Producer write API

| Operation | Repository method | Conflict semantics |
|---|---|---|
| Insert / replace same-source row for a cell | ITileRepository.InsertAsync(TileEntity) | ON CONFLICT (latitude, longitude, tile_zoom, tile_size_meters, source) DO UPDATE SET file_path, tile_x, tile_y, captured_at, updated_at. Producers MUST set Source and CapturedAt. |
| Update by primary key | ITileRepository.UpdateAsync(TileEntity) | Updates by id only. Caller's responsibility not to violate the unique index. |
| Delete by primary key | ITileRepository.DeleteAsync(Guid) | Removes a single row by id; no cascade. |
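
The same-source UPSERT behind InsertAsync could look roughly like the statement below. It is a sketch under assumptions: all literal values are illustrative, and taking the new values from EXCLUDED and refreshing updated_at with CURRENT_TIMESTAMP is an assumed reading of the "DO UPDATE SET" column list, not a copy of the repository code.

```sql
-- Insert a UAV tile; if a uav row already exists for the same cell,
-- overwrite it in place instead of creating a second row.
INSERT INTO tiles (
    id, tile_zoom, tile_x, tile_y, latitude, longitude,
    tile_size_meters, tile_size_pixels, image_type, file_path,
    source, captured_at
)
VALUES (
    '00000000-0000-0000-0000-000000000001',  -- example id
    18, 153298, 88391, 50.4501, 30.5234,     -- example cell
    97.35, 256, 'jpg', '/tiles/uav/18/153298/88391.jpg',
    'uav', TIMESTAMP '2026-05-11 06:30:00'   -- wire value and UTC capture time
)
ON CONFLICT (latitude, longitude, tile_zoom, tile_size_meters, source)
DO UPDATE SET
    file_path   = EXCLUDED.file_path,
    tile_x      = EXCLUDED.tile_x,
    tile_y      = EXCLUDED.tile_y,
    captured_at = EXCLUDED.captured_at,
    updated_at  = CURRENT_TIMESTAMP;
```

Because source is part of the conflict target, a google_maps row for the same cell is left untouched by this statement.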

Consumer read API and selection rule

| Operation | Repository method | Selection rule |
|---|---|---|
| Read by id | ITileRepository.GetByIdAsync(Guid) | Returns the row identified by id (no source filter). |
| Read most-recent for a cell by slippy coordinates | ITileRepository.GetByTileCoordinatesAsync(zoom, x, y) | Returns the row with the highest (captured_at, updated_at, id) tuple across all sources for that cell. At most one row. |
| Read region | ITileRepository.GetTilesByRegionAsync(lat, lon, sizeMeters, zoomLevel) | Returns at most one row per (latitude, longitude, tile_zoom, tile_size_meters) group, selected by the same most-recent rule. |

The selection rule picks the most recent row across all sources: rows are ordered by captured_at DESC, and ties are broken deterministically by updated_at DESC, then id DESC.
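
In PostgreSQL terms, the rule could be expressed along the following lines. Both queries are sketches: the coordinate values and the bounding-box filter are assumptions for illustration, and the repository's actual region predicate is not specified by this contract.

```sql
-- Single cell by slippy coordinates: newest row across all sources.
SELECT *
FROM tiles
WHERE tile_zoom = 18 AND tile_x = 153298 AND tile_y = 88391  -- example cell
ORDER BY captured_at DESC, updated_at DESC, id DESC
LIMIT 1;

-- Region read: DISTINCT ON keeps the first row of each cell group,
-- so the ORDER BY must start with the grouping columns and then
-- apply the most-recent rule within each group.
SELECT DISTINCT ON (latitude, longitude, tile_zoom, tile_size_meters) *
FROM tiles
WHERE tile_zoom = 18
  AND latitude  BETWEEN 50.44 AND 50.46   -- hypothetical bounding box
  AND longitude BETWEEN 30.51 AND 30.53
ORDER BY latitude, longitude, tile_zoom, tile_size_meters,
         captured_at DESC, updated_at DESC, id DESC;
```

Using the same three-column ordering in both queries is what keeps the per-cell result of the region read identical to the single-cell read.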

Invariants

  • Inv-1: Every row has a non-null source whose string value is a member of TileSource. Rows with unknown source values are a contract violation.
  • Inv-2: Every row has a non-null captured_at in UTC.
  • Inv-3: At most one row exists per (latitude, longitude, tile_zoom, tile_size_meters, source).
  • Inv-4: For any cell with one or more rows, the row returned by GetByTileCoordinatesAsync and the per-cell row returned by GetTilesByRegionAsync are identical.
  • Inv-5: The source column value space is closed: only the snake_case wire values defined in SatelliteProvider.Common.Enums.TileSourceConverter ("google_maps", "uav") are valid. Adding a new producer requires a new TileSource enum member, a corresponding wire value in TileSourceConverter, AND a contract version bump (minor). Note: TileEntity.Source is stored as the wire string (not the C# enum) because Dapper's TypeHandler<T> for enum types is bypassed during read deserialization (Dapper issue #259); TileSourceConverter.{ToWireValue,FromWireValue} is the documented bridge.
  • Inv-6: captured_at semantics are producer-defined per the Field Reference table above; consumers MUST NOT reinterpret it (e.g., consumers MUST NOT assume captured_at from google_maps reflects original imagery date).
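
Inv-1 and Inv-5 are not enforced by a database constraint in the schema shown above, so a periodic audit query may be useful. The following is a minimal sketch, assuming the closed value set is exactly the two wire values listed in this contract:

```sql
-- Any row returned here violates Inv-1 / Inv-5.
SELECT id, source, captured_at
FROM tiles
WHERE source NOT IN ('google_maps', 'uav');
```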

Non-Goals

  • Not covered: Per-source historical revision retention. Same-source uploads to the same cell overwrite the previous row by design — this is not a versioned table. Consumers wanting season selection or rollback must propose a v2 schema.
  • Not covered: Cross-source merging or compositing at read time. Reads return exactly one row per cell.
  • Not covered: Quality scoring, threshold gating, or any policy beyond the selection rule. Quality enforcement happens upstream of the write (T2).
  • Not covered: Backwards-compatible reads against the legacy 4-column unique index. Migration 013 is mandatory before any consumer of v1.0.0 runs.
  • Not covered: The vestigial maps_version and version columns. Consumers MUST NOT read them; producers MUST NOT write them in v1.0.0+.

Versioning Rules

  • Patch (1.0.x): Documentation clarifications, additional invariants that do not change runtime behavior, expanded test cases.
  • Minor (1.x.0): Adding a new TileSource enum member; adding optional columns that consumers may safely ignore; relaxing constraints in a backward-compatible way.
  • Major (2.0.0): Removing or renaming a column; changing the unique index columns; changing the selection rule (e.g., adding source priority); changing captured_at from required to optional or vice versa; introducing per-source historical revisions.

Each version bump requires updating the Change Log below and notifying every consumer listed in the header. If consumers' tasks have not yet been written, the producer task is responsible for surfacing the change to the user before merging.

Test Cases

| Case | Input | Expected | Notes |
|---|---|---|---|
| valid-google-only | Insert source='google_maps' captured_at=T1 for a fresh cell | Single row returned by region read; source='google_maps', captured_at=T1. | Baseline regression case. |
| valid-multi-source | Insert google_maps captured_at=T1, then uav captured_at=T2 > T1 for same cell | Both rows persisted; GetByTileCoordinatesAsync returns the uav row. | AC-1 + AC-2 of producer task. |
| same-source-upsert | Insert uav captured_at=T1, then uav captured_at=T2 > T1 for same cell | Exactly one uav row remains, with captured_at=T2 and updated file_path. | AC-3 of producer task. |
| time-tiebreak | Insert google_maps captured_at=T, then uav captured_at=T (identical) for same cell | Selection deterministic by (updated_at DESC, id DESC) tie-break; result must be reproducible across two test runs with the same seed. | Inv-4 enforcement. |
| backfill-completeness | Migration 013 against a snapshot DB with N pre-existing rows | Post-migration row count is N; every row has source='google_maps' and captured_at = created_at. | AC-4 of producer task. |
| invalid-source | Direct SQL insert with source='satar' (not in enum) | Repository read either rejects deserialization or raises a contract violation; behavior MUST surface the violation, not swallow it. | Inv-1 + coderule.mdc "never suppress errors silently". |
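
Two of the cases above lend themselves to direct SQL assertions. The queries below are a sketch of how a test might check them; they assume the test runs against a database where only the scenario under test has written rows.

```sql
-- backfill-completeness: run right after migration 013 on the snapshot.
-- Expect the total to equal N and violations to be 0.
SELECT count(*) AS total_rows,
       count(*) FILTER (
           WHERE source <> 'google_maps' OR captured_at <> created_at
       ) AS violations
FROM tiles;

-- same-source-upsert: no (cell, source) group may hold more than one row.
-- Expect an empty result set.
SELECT latitude, longitude, tile_zoom, tile_size_meters, source,
       count(*) AS row_count
FROM tiles
GROUP BY latitude, longitude, tile_zoom, tile_size_meters, source
HAVING count(*) > 1;
```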

Change Log

| Version | Date | Change | Author |
|---|---|---|---|
| 1.0.0 | 2026-05-11 | Initial contract: multi-source schema (source, captured_at), 5-column unique key, most-recent-across-sources read rule. Produced by AZ-484. | autodev (Step 9) |