satellite-provider/_docs/02_document/data_model.md
Oleksandr Bezdieniezhnykh 909f69cb3a [AZ-505] Tile inventory endpoint + HTTP/2 + Leaflet covering index
Production code:
- POST /api/satellite/tiles/inventory (XOR body, 5000-cap,
  most-recent-per-location_hash select, present/absent shaping).
- Kestrel HttpProtocols.Http1AndHttp2 on every listener (AC-5).
- Migration 015 creates tiles_leaflet_path covering index over
  (location_hash, captured_at DESC, updated_at DESC, id DESC)
  INCLUDE (file_path, source); drops superseded idx_tiles_location_hash.
- TileRepository.GetByTileCoordinatesAsync rewired to filter by
  location_hash (Index Only Scan via tiles_leaflet_path).
- TileRepository.GetTilesByLocationHashesAsync added with Npgsql-
  direct ANY($1::uuid[]) binding (Dapper IEnumerable expansion is
  incompatible with the array form).
- Uuidv5.LocationHashForTile centralises the UUIDv5(TileNamespace,
  "{z}/{x}/{y}") formula — single source of truth for the cross-repo
  invariant (gps-denied-onboard parity).

Contracts:
- New: contracts/api/tile-inventory.md v1.0.0.
- Bumped: contracts/data-access/tile-storage.md to v2.0.0 (joint
  ownership by AZ-503-foundation + AZ-505: schema + covering index +
  GetByTileCoordinatesAsync rewrite).

Tests:
- TileInventoryTests covers AC-1, AC-2 (DB-level), AC-4, AC-6.
- Http2MultiplexingTests covers AC-5 (20 concurrent multiplexed GETs
  over h2c via SocketsHttpHandler + AppContext Http2Unencrypted switch).
- LeafletPathIndexOnlyTests covers AC-3 (EXPLAIN (ANALYZE, BUFFERS)
  asserts Index Only Scan over tiles_leaflet_path with heap_blocks=0).

Docs:
- architecture.md, system-flows.md, data_model.md, module-layout.md,
  glossary.md, modules/api_program.md, modules/dataaccess_tile_repository.md,
  components/02_data_access/description.md all updated to reference the
  v2.0.0 tile-storage contract + new tile-inventory contract + AC-7.

Reports:
- batch_01_cycle6_report.md, batch_01_cycle6_review.md,
  implementation_completeness_cycle6_report.md (PASS),
  implementation_report_tile_inventory_cycle6.md.

Task spec moved todo/ -> done/.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 21:16:37 +03:00


Satellite Provider — Data Model

Entity-Relationship Diagram

erDiagram
    TILES {
        uuid id PK "AZ-503: deterministic UUIDv5"
        int tile_zoom
        float latitude
        float longitude
        float tile_size_meters
        int tile_size_pixels
        varchar image_type
        varchar maps_version
        int version
        varchar source
        timestamp captured_at
        uuid flight_id "AZ-503 nullable"
        uuid location_hash "AZ-503 NOT NULL"
        bytea content_sha256 "AZ-503 nullable (app-NOT-NULL for new)"
        uuid legacy_id "AZ-503 nullable, pre-migration id"
        varchar file_path
        int tile_x
        int tile_y
        timestamp created_at
        timestamp updated_at
    }

    REGIONS {
        uuid id PK
        float latitude
        float longitude
        float size_meters
        int zoom_level
        varchar status
        bool stitch_tiles
        varchar csv_file_path
        varchar summary_file_path
        int tiles_downloaded
        int tiles_reused
        timestamp created_at
        timestamp updated_at
    }

    ROUTES {
        uuid id PK
        varchar name
        text description
        float region_size_meters
        int zoom_level
        float total_distance_meters
        int total_points
        bool request_maps
        bool maps_ready
        bool create_tiles_zip
        varchar tiles_zip_path
        varchar csv_file_path
        varchar summary_file_path
        varchar stitched_image_path
        timestamp created_at
        timestamp updated_at
    }

    ROUTE_POINTS {
        uuid id PK
        uuid route_id FK
        int sequence_number
        float latitude
        float longitude
        varchar point_type
        int segment_index
        float distance_from_previous
        timestamp created_at
    }

    ROUTE_REGIONS {
        uuid route_id FK
        uuid region_id FK
        bool is_geofence
        int geofence_polygon_index
        timestamp created_at
    }

    ROUTES ||--o{ ROUTE_POINTS : "has many"
    ROUTES ||--o{ ROUTE_REGIONS : "has many"
    REGIONS ||--o{ ROUTE_REGIONS : "linked via"

Tables

tiles

Stores metadata for downloaded satellite imagery tiles. Each tile is a single image at a specific geographic coordinate and zoom level.
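The tile_x / tile_y columns are described as Slippy Map coordinates. As a rough illustration, and assuming the service follows the standard Web Mercator Slippy Map convention (this document does not show the project's own conversion code), a latitude/longitude pair maps to tile indices roughly like this. The function name is hypothetical.

```python
import math


def lat_lon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Convert WGS84 degrees to Slippy Map (x, y) tile indices.

    Assumption: standard Web Mercator tiling, 2**zoom tiles per axis,
    with y growing southwards from the top of the map.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon == 180 or latitudes beyond ~85.05 stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

Clamping is a choice made for this sketch; a real implementation might instead reject out-of-range coordinates.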

| Column | Type | Constraints | Description |
|---|---|---|---|
| id | UUID | PK | AZ-503: deterministic UUIDv5 of {tile_zoom}/{tile_x}/{tile_y}/{source}/{flight_id or zero-uuid} under Uuidv5.TileNamespace. Stable across re-ingests; preserved on UPSERT conflict (AC-2 idempotence). Pre-AZ-503 rows have their original random Guid; migration 014 also copies that value into legacy_id for one-cycle forensics. |
| tile_zoom | INT | NOT NULL | Google Maps zoom level (1-20) |
| latitude | DOUBLE PRECISION | NOT NULL | Center latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Center longitude |
| tile_size_meters | DOUBLE PRECISION | NOT NULL | Ground coverage in meters |
| tile_size_pixels | INT | NOT NULL | Image dimension in pixels |
| image_type | VARCHAR(10) | NOT NULL | Image format (e.g., "jpg") |
| maps_version | VARCHAR(50) | | Legacy free-form provider tag; post-AZ-373 new rows write NULL. Vestigial post-AZ-484 (column retained for forensics on pre-existing rows; no longer part of any index) |
| version | INT | NOT NULL, DEFAULT 2025 | Year-based versioning for cache invalidation. Vestigial post-AZ-484: removed from the unique key by migration 012 (preparation for AZ-484); column retained nullable for backward compatibility |
| source | VARCHAR(32) | NOT NULL, DEFAULT 'google_maps' | AZ-484: producer of the imagery ('google_maps', 'uav'). Closed value set; see tile-storage v1.0.0 contract Inv-5 and Common.Enums.TileSourceConverter. Backfilled to 'google_maps' for all pre-AZ-484 rows by migration 013 |
| captured_at | TIMESTAMP | NOT NULL | AZ-484: imagery acquisition timestamp (UTC). Drives most-recent-across-sources selection. Backfilled to created_at for pre-AZ-484 rows by migration 013 |
| file_path | VARCHAR(500) | NOT NULL | Relative path to stored image. AZ-503 per-flight UAV layout (supersedes AZ-488): source='google_maps' rows keep the legacy bucketed/timestamped path emitted by StorageConfig.GetTileFilePath ({TilesDirectory}/{zoom}/{x_bucket}/{y_bucket}/tile_{zoom}_{x}_{y}_{ts}.jpg). source='uav' rows live under {TilesDirectory}/uav/{flight_id or 'none'}/{zoom}/{x}/{y}.jpg, so rm -rf ./tiles/uav/{flight_id}/ cleanly removes one flight's evidence without disturbing other flights at overlapping cells. The authoritative source marker is the source column; the per-source / per-flight path is an implementation detail that keeps both producers' bytes individually addressable. |
| tile_x | INT | NOT NULL | Tile X coordinate (Slippy Map) |
| tile_y | INT | NOT NULL | Tile Y coordinate (Slippy Map) |
| flight_id | UUID | NULL | AZ-503: optional flight identifier. NULL for Google Maps tiles and anonymous UAV uploads; populated from UavTileMetadata.FlightId when present. Part of the UPSERT conflict key via COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid), so two flights uploading the same (z, x, y) cell produce two separate rows. |
| location_hash | UUID | NOT NULL | AZ-503: deterministic UUIDv5 of {tile_zoom}/{tile_x}/{tile_y} under Uuidv5.TileNamespace. Identical across flights and sources for the same cell. Backfilled in migration 014 via a pg_temp.uuidv5 PL/pgSQL function. AZ-505 made this the keyed read column for GetByTileCoordinatesAsync (leaflet hot path) and the bulk lookup column for GetTilesByLocationHashesAsync (POST /api/satellite/tiles/inventory); covered by the tiles_leaflet_path index. |
| content_sha256 | BYTEA | NULL | AZ-503: SHA-256 digest of the JPEG body. Application-layer NOT NULL for new writes (enforced in TileService.BuildTileEntity + UavTileUploadHandler.PersistAsync); DB column is NULLABLE because legacy pre-migration rows cannot be backfilled reliably from disk. See batch_02_cycle5_report.md "Low maintainability finding" for the rationale. |
| legacy_id | UUID | NULL | AZ-503: pre-migration id value, copied by migration 014 for one-cycle forensics. To be dropped in a future migration once the cross-repo cutover settles. |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

Indexes (post-AZ-505):

  • idx_tiles_unique_identity UNIQUE (tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid)) — created by migration 014; replaces the AZ-484 idx_tiles_unique_location_source (5-col float-based). Integer-only conflict columns eliminate float-rounding collisions; the COALESCE lets per-flight rows coexist while keeping single-row semantics for anonymous and google_maps rows.
  • tiles_leaflet_path (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source) — created by AZ-505 migration 015. Drives GET /tiles/{z}/{x}/{y} (Index Only Scan for the leaflet hot path) and the POST /api/satellite/tiles/inventory bulk lookup (leading column matches the WHERE location_hash = ANY($1::uuid[]) predicate). The lightweight idx_tiles_location_hash from migration 014 is dropped by migration 015 — equality lookups by location_hash use the leading column of the covering index, making the lookup-only index redundant.
  • idx_tiles_coordinates (tile_zoom, tile_x, tile_y, version)
  • idx_tiles_zoom (tile_zoom)

Selection rule (unchanged from AZ-484): GetByTileCoordinatesAsync and GetTilesByRegionAsync return the most-recent row across sources for any (latitude, longitude, tile_zoom, tile_size_meters) cell. Tie-break: captured_at DESC, updated_at DESC, id DESC. Region read uses DISTINCT ON to enforce one-row-per-cell at the SQL layer.
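The tie-break order can be illustrated with a small in-memory sketch. It is not the repository's SQL; it only mirrors the ordering captured_at DESC, updated_at DESC, id DESC and keeps one row per cell. Grouping by location_hash is an assumption made here as a stand-in for the cell key, and TileRow and most_recent_per_cell are hypothetical names.

```python
import uuid
from datetime import datetime
from typing import Iterable, NamedTuple


class TileRow(NamedTuple):
    id: uuid.UUID
    location_hash: uuid.UUID
    source: str
    captured_at: datetime
    updated_at: datetime


def most_recent_per_cell(rows: Iterable[TileRow]) -> dict[uuid.UUID, TileRow]:
    """Keep, per location_hash, the row that sorts first under
    captured_at DESC, updated_at DESC, id DESC."""
    best: dict[uuid.UUID, TileRow] = {}
    for row in rows:
        current = best.get(row.location_hash)
        rank = (row.captured_at, row.updated_at, row.id)
        if current is None or rank > (current.captured_at, current.updated_at, current.id):
            best[row.location_hash] = row
    return best
```

Because every component of the ordering is descending, "first in DESC order" is simply the maximum of the tuple, which is why a plain tuple comparison suffices here.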

UPSERT contract (AZ-503): INSERT … ON CONFLICT (tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-...'::uuid)) DO UPDATE refreshes file_path, latitude, longitude, captured_at, location_hash, content_sha256, updated_at. id is intentionally NOT overwritten on conflict, preserving AC-2 idempotence (same inputs ⇒ same id). Two sources (or two flights of the same source) at the same cell coexist as separate rows.
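The conflict semantics can be modelled without a database. The sketch below uses a plain dict keyed by the conflict columns, with the COALESCE on flight_id replaced by a zero UUID; it shows that a repeated write refreshes the listed columns while leaving id untouched. It is an illustration of the contract as described, not the repository code, and the helper names are hypothetical.

```python
import uuid

ZERO_UUID = uuid.UUID(int=0)

# Columns the contract says are refreshed on conflict; id is deliberately absent.
REFRESHED = ("file_path", "latitude", "longitude", "captured_at",
             "location_hash", "content_sha256", "updated_at")


def conflict_key(row: dict) -> tuple:
    """Mirror of the unique identity: integer cell, size, source, COALESCE(flight_id)."""
    return (row["tile_zoom"], row["tile_x"], row["tile_y"],
            row["tile_size_meters"], row["source"],
            row.get("flight_id") or ZERO_UUID)


def upsert(table: dict, row: dict) -> dict:
    """Insert the row, or refresh only the REFRESHED columns of the existing one."""
    key = conflict_key(row)
    existing = table.get(key)
    if existing is None:
        table[key] = dict(row)
    else:
        for column in REFRESHED:
            if column in row:
                existing[column] = row[column]
    return table[key]
```

Writing the same cell twice with the same source and no flight yields one row whose id is the first one written; adding a flight_id produces a second, separate row.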

regions

Tracks region download requests and their processing status.

| Column | Type | Constraints | Description |
|---|---|---|---|
| id | UUID | PK | Region request identifier |
| latitude | DOUBLE PRECISION | NOT NULL | Center latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Center longitude |
| size_meters | DOUBLE PRECISION | NOT NULL | Square region side length |
| zoom_level | INT | NOT NULL | Zoom level for tiles |
| status | VARCHAR(20) | NOT NULL | pending / processing / completed / failed |
| stitch_tiles | BOOLEAN | NOT NULL, DEFAULT false | Whether to produce stitched image |
| csv_file_path | VARCHAR(500) | | Path to tile manifest CSV |
| summary_file_path | VARCHAR(500) | | Path to summary text |
| tiles_downloaded | INT | DEFAULT 0 | Count of newly downloaded tiles |
| tiles_reused | INT | DEFAULT 0 | Count of cache-hit tiles |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

Indexes:

  • idx_regions_status (status)

routes

Defines route paths with configuration for map tile generation.

| Column | Type | Constraints | Description |
|---|---|---|---|
| id | UUID | PK | Route identifier |
| name | VARCHAR(200) | NOT NULL | Human-readable name |
| description | TEXT | | Optional description |
| region_size_meters | DOUBLE PRECISION | NOT NULL | Size of region per point |
| zoom_level | INT | NOT NULL | Zoom level for regions |
| total_distance_meters | DOUBLE PRECISION | NOT NULL | Total route length |
| total_points | INT | NOT NULL | Total point count (original + interpolated) |
| request_maps | BOOLEAN | NOT NULL, DEFAULT false | Whether to generate map tiles |
| maps_ready | BOOLEAN | NOT NULL, DEFAULT false | Whether map generation is complete |
| create_tiles_zip | BOOLEAN | NOT NULL, DEFAULT false | Whether to produce ZIP archive |
| tiles_zip_path | VARCHAR(500) | | Path to output ZIP |
| csv_file_path | VARCHAR(500) | | Route-level CSV |
| summary_file_path | VARCHAR(500) | | Route-level summary |
| stitched_image_path | VARCHAR(500) | | Route-level stitched image |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

route_points

Stores all points along a route (both original waypoints and interpolated intermediate points).

| Column | Type | Constraints | Description |
|---|---|---|---|
| id | UUID | PK | Point identifier |
| route_id | UUID | FK → routes.id, CASCADE | Parent route |
| sequence_number | INT | NOT NULL, UNIQUE(route_id, seq) | Order along route |
| latitude | DOUBLE PRECISION | NOT NULL | Point latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Point longitude |
| point_type | VARCHAR(20) | NOT NULL | "original" or "intermediate" |
| segment_index | INT | NOT NULL | Which segment (between original points) |
| distance_from_previous | DOUBLE PRECISION | | Meters from previous point |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

Indexes:

  • idx_route_points_route (route_id, sequence_number)
  • idx_route_points_coords (latitude, longitude)

route_regions

Junction table linking routes to their generated region requests, with geofence metadata.

| Column | Type | Constraints | Description |
|---|---|---|---|
| route_id | UUID | FK → routes.id, CASCADE, PK | |
| region_id | UUID | FK → regions.id, CASCADE, PK | |
| is_geofence | BOOLEAN | NOT NULL, DEFAULT false | Whether point is inside a geofence |
| geofence_polygon_index | INTEGER | | Which polygon (0-based) the point is in |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

Indexes:

  • idx_route_regions_route (route_id)
  • idx_route_regions_region (region_id)

Migration Strategy

  • Tool: DbUp (embedded SQL scripts)
  • Execution: Automatic on application startup (DatabaseMigrator.Migrate())
  • Naming: NNN_DescriptiveName.sql (sequential numbering)
  • Storage: Embedded resources in SatelliteProvider.DataAccess assembly
  • Tracking: DbUp's internal schemaversions table records which scripts have run
  • Rollback: Not supported — forward-only migrations
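The naming and tracking rules above imply a simple "run what has not run yet, in name order" loop. The sketch below models that selection step in Python; it is a simplification and not DbUp's API. In particular, it assumes scripts are identified by their bare file names (DbUp itself journals embedded-resource names), and pending_scripts is a hypothetical helper.

```python
import re

# NNN_DescriptiveName.sql: three digits, underscore, alphanumeric name.
SCRIPT_NAME = re.compile(r"^\d{3}_[A-Za-z0-9]+\.sql$")


def pending_scripts(embedded: list[str], executed: list[str]) -> list[str]:
    """Return scripts that follow the naming rule and are not yet journaled,
    sorted by name so the NNN prefix fixes the execution order."""
    already_run = set(executed)
    return sorted(
        name for name in embedded
        if SCRIPT_NAME.match(name) and name not in already_run
    )
```

Zero-padding the numeric prefix is what makes a plain lexicographic sort equal to the intended sequence; with forward-only migrations there is no corresponding "undo" list.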

Migration History

| # | Migration | Purpose |
|---|---|---|
| 001 | CreateTilesTable | Base tiles table |
| 002 | CreateRegionsTable | Region request tracking |
| 003 | CreateIndexes | Performance indexes |
| 004 | AddVersionColumn | Year-based tile versioning + dedup |
| 005 | CreateRoutesTables | Routes, route_points, route_regions |
| 006 | AddStitchTilesToRegions | Stitch flag on regions |
| 007 | AddRouteMapFields | request_maps, maps_ready, file paths on routes |
| 008 | AddGeofenceFlagToRouteRegions | is_geofence flag |
| 009 | AddGeofencePolygonIndex | Polygon index tracking |
| 010 | AddTilesZipToRoutes | ZIP generation fields |
| 011 | AddTileCoordinates | Slippy map X/Y + rename zoom_level → tile_zoom |
| 012 | DropTileVersionConstraint | Drops legacy 5-col (…, version) unique index; replaces with 4-col idx_tiles_unique_location (preparation for AZ-484) |
| 013 | AddTileSourceAndCapturedAt | AZ-484: adds source (default 'google_maps') + captured_at columns; backfills both for pre-existing rows; replaces 4-col unique with 5-col idx_tiles_unique_location_source. Transactional; idempotent against partial replays |
| 014 | AddTileIdentityColumns | AZ-503: adds flight_id (NULL), location_hash (NOT NULL after backfill), content_sha256 (NULL), legacy_id (NULL); backfills location_hash via pg_temp.uuidv5(TILE_NAMESPACE, "{tile_zoom}/{tile_x}/{tile_y}") and copies id → legacy_id for every pre-existing row; drops idx_tiles_unique_location_source (AZ-484) and creates idx_tiles_unique_identity (integer + flight-aware) + idx_tiles_location_hash. Enables pgcrypto for the in-migration SHA-1 digest. Transactional; safe to replay (column adds are IF NOT EXISTS-equivalent, backfill is idempotent on location_hash because UUIDv5 is deterministic) |
| 015 | AddTilesLeafletPathIndex | AZ-505: creates tiles_leaflet_path (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source) covering index for the leaflet hot path; drops the superseded idx_tiles_location_hash from migration 014 (equality lookups by location_hash now use the leading column of the covering index). Transactional; runs inside DbUp's per-script transaction (incompatible with CREATE INDEX CONCURRENTLY), so schedule deploys to a low-traffic window on populated tables. INCLUDE columns intentionally narrow (file_path, source); inventory queries that need more columns trigger a bounded heap fetch (per AZ-505 NFR-Perf-2 budget). |