mirror of
https://github.com/azaion/satellite-provider.git
synced 2026-06-21 15:11:14 +00:00
909f69cb3a
Production code:
- POST /api/satellite/tiles/inventory (XOR body, 5000-cap,
most-recent-per-location_hash select, present/absent shaping).
- Kestrel HttpProtocols.Http1AndHttp2 on every listener (AC-5).
- Migration 015 creates tiles_leaflet_path covering index over
(location_hash, captured_at DESC, updated_at DESC, id DESC)
INCLUDE (file_path, source); drops superseded idx_tiles_location_hash.
- TileRepository.GetByTileCoordinatesAsync rewired to filter by
location_hash (Index Only Scan via tiles_leaflet_path).
- TileRepository.GetTilesByLocationHashesAsync added with Npgsql-
direct ANY($1::uuid[]) binding (Dapper IEnumerable expansion is
incompatible with the array form).
- Uuidv5.LocationHashForTile centralises the UUIDv5(TileNamespace,
"{z}/{x}/{y}") formula — single source of truth for the cross-repo
invariant (gps-denied-onboard parity).
Contracts:
- New: contracts/api/tile-inventory.md v1.0.0.
- Bumped: contracts/data-access/tile-storage.md to v2.0.0 (joint
ownership by AZ-503-foundation + AZ-505: schema + covering index +
GetByTileCoordinatesAsync rewrite).
Tests:
- TileInventoryTests covers AC-1, AC-2 (DB-level), AC-4, AC-6.
- Http2MultiplexingTests covers AC-5 (20 concurrent multiplexed GETs
over h2c via SocketsHttpHandler + AppContext Http2Unencrypted switch).
- LeafletPathIndexOnlyTests covers AC-3 (EXPLAIN (ANALYZE, BUFFERS)
asserts Index Only Scan over tiles_leaflet_path with heap_blocks=0).
Docs:
- architecture.md, system-flows.md, data_model.md, module-layout.md,
glossary.md, modules/api_program.md, modules/dataaccess_tile_repository.md,
components/02_data_access/description.md all updated to reference the
v2.0.0 tile-storage contract + new tile-inventory contract + AC-7.
Reports:
- batch_01_cycle6_report.md, batch_01_cycle6_review.md,
implementation_completeness_cycle6_report.md (PASS),
implementation_report_tile_inventory_cycle6.md.
Task spec moved todo/ -> done/.
Co-authored-by: Cursor <cursoragent@cursor.com>
239 lines
15 KiB
Markdown
# Satellite Provider — Data Model

## Entity-Relationship Diagram

```mermaid
erDiagram
    TILES {
        uuid id PK "AZ-503: deterministic UUIDv5"
        int tile_zoom
        float latitude
        float longitude
        float tile_size_meters
        int tile_size_pixels
        varchar image_type
        varchar maps_version
        int version
        varchar source
        timestamp captured_at
        uuid flight_id "AZ-503 nullable"
        uuid location_hash "AZ-503 NOT NULL"
        bytea content_sha256 "AZ-503 nullable (app-NOT-NULL for new)"
        uuid legacy_id "AZ-503 nullable, pre-migration id"
        varchar file_path
        int tile_x
        int tile_y
        timestamp created_at
        timestamp updated_at
    }

    REGIONS {
        uuid id PK
        float latitude
        float longitude
        float size_meters
        int zoom_level
        varchar status
        bool stitch_tiles
        varchar csv_file_path
        varchar summary_file_path
        int tiles_downloaded
        int tiles_reused
        timestamp created_at
        timestamp updated_at
    }

    ROUTES {
        uuid id PK
        varchar name
        text description
        float region_size_meters
        int zoom_level
        float total_distance_meters
        int total_points
        bool request_maps
        bool maps_ready
        bool create_tiles_zip
        varchar tiles_zip_path
        varchar csv_file_path
        varchar summary_file_path
        varchar stitched_image_path
        timestamp created_at
        timestamp updated_at
    }

    ROUTE_POINTS {
        uuid id PK
        uuid route_id FK
        int sequence_number
        float latitude
        float longitude
        varchar point_type
        int segment_index
        float distance_from_previous
        timestamp created_at
    }

    ROUTE_REGIONS {
        uuid route_id FK
        uuid region_id FK
        bool is_geofence
        int geofence_polygon_index
        timestamp created_at
    }

    ROUTES ||--o{ ROUTE_POINTS : "has many"
    ROUTES ||--o{ ROUTE_REGIONS : "has many"
    REGIONS ||--o{ ROUTE_REGIONS : "linked via"
```

## Tables

### tiles

Stores metadata for downloaded satellite imagery tiles. Each tile is a single image at a specific geographic coordinate and zoom level.

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PK | AZ-503: deterministic UUIDv5 of `{tile_zoom}/{tile_x}/{tile_y}/{source}/{flight_id or zero-uuid}` under `Uuidv5.TileNamespace`. Stable across re-ingests; preserved on UPSERT conflict (AC-2 idempotence). Pre-AZ-503 rows have their original random Guid; migration 014 also copies that value into `legacy_id` for one-cycle forensics. |
| tile_zoom | INT | NOT NULL | Google Maps zoom level (1-20) |
| latitude | DOUBLE PRECISION | NOT NULL | Center latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Center longitude |
| tile_size_meters | DOUBLE PRECISION | NOT NULL | Ground coverage in meters |
| tile_size_pixels | INT | NOT NULL | Image dimension in pixels |
| image_type | VARCHAR(10) | NOT NULL | Image format (e.g., "jpg") |
| maps_version | VARCHAR(50) | | Legacy free-form provider tag; post-AZ-373 new rows write NULL. Vestigial post-AZ-484 (column retained for forensics on pre-existing rows; no longer part of any index) |
| version | INT | NOT NULL, DEFAULT 2025 | Year-based versioning for cache invalidation. Vestigial post-AZ-484 — removed from the unique key by migration 012 (preparation for AZ-484); column retained nullable for backward compatibility |
| source | VARCHAR(32) | NOT NULL, DEFAULT 'google_maps' | AZ-484: producer of the imagery (`'google_maps'`, `'uav'`). Closed value set — see `tile-storage` v1.0.0 contract Inv-5 and `Common.Enums.TileSourceConverter`. Backfilled to `'google_maps'` for all pre-AZ-484 rows by migration 013 |
| captured_at | TIMESTAMP | NOT NULL | AZ-484: imagery acquisition timestamp (UTC). Drives most-recent-across-sources selection. Backfilled to `created_at` for pre-AZ-484 rows by migration 013 |
| file_path | VARCHAR(500) | NOT NULL | Relative path to stored image. **AZ-503 per-flight UAV layout** (supersedes AZ-488): `source='google_maps'` rows keep the legacy bucketed/timestamped path emitted by `StorageConfig.GetTileFilePath` (`{TilesDirectory}/{zoom}/{x_bucket}/{y_bucket}/tile_{zoom}_{x}_{y}_{ts}.jpg`). `source='uav'` rows live under `{TilesDirectory}/uav/{flight_id or 'none'}/{zoom}/{x}/{y}.jpg` — so `rm -rf ./tiles/uav/{flight_id}/` cleanly removes one flight's evidence without disturbing other flights at overlapping cells. The authoritative source marker is the `source` column; the per-source / per-flight path is implementation detail that keeps both producers' bytes individually addressable. |
| tile_x | INT | NOT NULL | Tile X coordinate (Slippy Map) |
| tile_y | INT | NOT NULL | Tile Y coordinate (Slippy Map) |
| flight_id | UUID | NULL | AZ-503: optional flight identifier. `NULL` for Google Maps tiles and anonymous UAV uploads; populated from `UavTileMetadata.FlightId` when present. Part of the UPSERT conflict key via `COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid)`, so two flights uploading the same `(z, x, y)` cell produce two separate rows. |
| location_hash | UUID | NOT NULL | AZ-503: deterministic UUIDv5 of `{tile_zoom}/{tile_x}/{tile_y}` under `Uuidv5.TileNamespace`. Identical across flights and sources for the same cell. Backfilled in migration 014 via a `pg_temp.uuidv5` PL/pgSQL function. AZ-505 made this the keyed read column for `GetByTileCoordinatesAsync` (leaflet hot path) and the bulk lookup column for `GetTilesByLocationHashesAsync` (`POST /api/satellite/tiles/inventory`); covered by the `tiles_leaflet_path` index. |
| content_sha256 | BYTEA | NULL | AZ-503: SHA-256 digest of the JPEG body. Application-layer NOT NULL for new writes (enforced in `TileService.BuildTileEntity` + `UavTileUploadHandler.PersistAsync`); DB column is NULLABLE because legacy pre-migration rows cannot be backfilled reliably from disk. See `batch_02_cycle5_report.md` "Low maintainability finding" for the rationale. |
| legacy_id | UUID | NULL | AZ-503: pre-migration `id` value, copied by migration 014 for one-cycle forensics. To be dropped in a future migration once the cross-repo cutover settles. |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

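
The two deterministic identifiers described in the table (`id` and `location_hash`) can be sketched as follows. This is a minimal illustration, not the project's `Uuidv5` helper: the namespace value below is a placeholder (the real `Uuidv5.TileNamespace` is not given in this document), and the textual form used for `flight_id` inside the name string is an assumption.

```python
import uuid
from typing import Optional

# Placeholder namespace: the real Uuidv5.TileNamespace value is not stated here.
TILE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")
ZERO_UUID = uuid.UUID(int=0)


def location_hash(zoom: int, x: int, y: int) -> uuid.UUID:
    """UUIDv5 of "{z}/{x}/{y}": identical across sources and flights."""
    return uuid.uuid5(TILE_NAMESPACE, f"{zoom}/{x}/{y}")


def tile_id(zoom: int, x: int, y: int, source: str,
            flight_id: Optional[uuid.UUID] = None) -> uuid.UUID:
    """UUIDv5 of "{z}/{x}/{y}/{source}/{flight_id or zero-uuid}".

    Assumption: the flight id is rendered in lowercase hyphenated form.
    """
    flight = flight_id if flight_id is not None else ZERO_UUID
    return uuid.uuid5(TILE_NAMESPACE, f"{zoom}/{x}/{y}/{source}/{flight}")


if __name__ == "__main__":
    cell = (18, 153_000, 88_000)
    print("location_hash:", location_hash(*cell))
    print("google_maps id:", tile_id(*cell, "google_maps"))
    print("uav id:", tile_id(*cell, "uav"))
```

Because both values depend only on their inputs, re-ingesting the same tile yields the same `id`, while two sources at the same cell share a `location_hash` but get distinct `id` values.
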
**Indexes** (post-AZ-503):

- `idx_tiles_unique_identity` UNIQUE (tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid)) — created by migration 014; replaces the AZ-484 `idx_tiles_unique_location_source` (5-col float-based). Integer-only conflict columns eliminate float-rounding collisions; the `COALESCE` lets per-flight rows coexist while keeping single-row semantics for anonymous and `google_maps` rows.
- `tiles_leaflet_path` (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source) — created by AZ-505 migration 015. Drives `GET /tiles/{z}/{x}/{y}` (`Index Only Scan` for the leaflet hot path) and the `POST /api/satellite/tiles/inventory` bulk lookup (leading column matches the `WHERE location_hash = ANY($1::uuid[])` predicate). The lightweight `idx_tiles_location_hash` from migration 014 is dropped by migration 015 — equality lookups by `location_hash` use the leading column of the covering index, making the lookup-only index redundant.
- `idx_tiles_coordinates` (tile_zoom, tile_x, tile_y, version)
- `idx_tiles_zoom` (tile_zoom)

**Selection rule** (unchanged from AZ-484): `GetByTileCoordinatesAsync` and `GetTilesByRegionAsync` return the most-recent row across sources for any `(latitude, longitude, tile_zoom, tile_size_meters)` cell. Since AZ-505, `GetByTileCoordinatesAsync` locates the cell by filtering on `location_hash` (see the column description above); the ordering is the same. Tie-break: `captured_at DESC, updated_at DESC, id DESC`. Region read uses `DISTINCT ON` to enforce one-row-per-cell at the SQL layer.

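
As an illustration of the selection rule, the sketch below picks one winner per cell using the same ordering (`captured_at DESC, updated_at DESC, id DESC`). It is an in-memory model, not the repository's SQL; `TileRow`, `cell`, and `most_recent_per_cell` are hypothetical names introduced here.

```python
import uuid
from datetime import datetime
from typing import Dict, Hashable, Iterable, NamedTuple


class TileRow(NamedTuple):
    id: uuid.UUID
    cell: Hashable          # whatever identifies the cell, e.g. a location_hash
    source: str
    captured_at: datetime
    updated_at: datetime


def _rank(row: TileRow):
    # A larger tuple wins, mirroring ORDER BY ... DESC on all three columns.
    return (row.captured_at, row.updated_at, row.id)


def most_recent_per_cell(rows: Iterable[TileRow]) -> Dict[Hashable, TileRow]:
    """Keep exactly one row per cell: the most recent across all sources."""
    best: Dict[Hashable, TileRow] = {}
    for row in rows:
        current = best.get(row.cell)
        if current is None or _rank(row) > _rank(current):
            best[row.cell] = row
    return best
```

A UAV row captured after a Google Maps row at the same cell wins regardless of source; `updated_at` and then `id` only matter when the earlier keys are equal.
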
**UPSERT contract** (AZ-503): `INSERT … ON CONFLICT (tile_zoom, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-...'::uuid)) DO UPDATE` refreshes `file_path, latitude, longitude, captured_at, location_hash, content_sha256, updated_at`. `id` is intentionally NOT overwritten on conflict, preserving AC-2 idempotence (same inputs ⇒ same id). Two sources (or two flights of the same source) at the same cell coexist as separate rows.

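
The UPSERT semantics can be modelled with a small in-memory sketch. This is an illustration under stated assumptions rather than the repository code: rows are plain dictionaries, the "table" is a dictionary keyed by the conflict key, and the function names are hypothetical.

```python
import uuid
from typing import Any, Dict, Tuple

ZERO_UUID = uuid.UUID(int=0)

# Columns refreshed by DO UPDATE, as listed in the contract above.
REFRESHED = ("file_path", "latitude", "longitude", "captured_at",
             "location_hash", "content_sha256", "updated_at")


def conflict_key(row: Dict[str, Any]) -> Tuple:
    """Mirror of the unique index: a NULL flight_id coalesces to the zero UUID."""
    flight = row.get("flight_id")
    if flight is None:
        flight = ZERO_UUID
    return (row["tile_zoom"], row["tile_x"], row["tile_y"],
            row["tile_size_meters"], row["source"], flight)


def upsert(table: Dict[Tuple, Dict[str, Any]], row: Dict[str, Any]) -> Dict[str, Any]:
    """Insert a new row, or refresh the mutable columns of the existing one."""
    key = conflict_key(row)
    existing = table.get(key)
    if existing is None:
        table[key] = dict(row)
    else:
        for column in REFRESHED:
            existing[column] = row.get(column)
        # "id" is deliberately left untouched on conflict.
    return table[key]
```

Writing the same cell twice for one source keeps a single row with its original `id`; writing it for a second source or a second flight produces an additional row.
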
### regions

Tracks region download requests and their processing status.

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PK | Region request identifier |
| latitude | DOUBLE PRECISION | NOT NULL | Center latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Center longitude |
| size_meters | DOUBLE PRECISION | NOT NULL | Square region side length |
| zoom_level | INT | NOT NULL | Zoom level for tiles |
| status | VARCHAR(20) | NOT NULL | pending / processing / completed / failed |
| stitch_tiles | BOOLEAN | NOT NULL, DEFAULT false | Whether to produce stitched image |
| csv_file_path | VARCHAR(500) | | Path to tile manifest CSV |
| summary_file_path | VARCHAR(500) | | Path to summary text |
| tiles_downloaded | INT | DEFAULT 0 | Count of newly downloaded tiles |
| tiles_reused | INT | DEFAULT 0 | Count of cache-hit tiles |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

**Indexes**:

- `idx_regions_status` (status)

### routes

Defines route paths with configuration for map tile generation.

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PK | Route identifier |
| name | VARCHAR(200) | NOT NULL | Human-readable name |
| description | TEXT | | Optional description |
| region_size_meters | DOUBLE PRECISION | NOT NULL | Size of region per point |
| zoom_level | INT | NOT NULL | Zoom level for regions |
| total_distance_meters | DOUBLE PRECISION | NOT NULL | Total route length |
| total_points | INT | NOT NULL | Total point count (original + interpolated) |
| request_maps | BOOLEAN | NOT NULL, DEFAULT false | Whether to generate map tiles |
| maps_ready | BOOLEAN | NOT NULL, DEFAULT false | Whether map generation is complete |
| create_tiles_zip | BOOLEAN | NOT NULL, DEFAULT false | Whether to produce ZIP archive |
| tiles_zip_path | VARCHAR(500) | | Path to output ZIP |
| csv_file_path | VARCHAR(500) | | Route-level CSV |
| summary_file_path | VARCHAR(500) | | Route-level summary |
| stitched_image_path | VARCHAR(500) | | Route-level stitched image |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |
| updated_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

### route_points

Stores all points along a route (both original waypoints and interpolated intermediate points).

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | UUID | PK | Point identifier |
| route_id | UUID | FK → routes.id, CASCADE | Parent route |
| sequence_number | INT | NOT NULL, UNIQUE(route_id, seq) | Order along route |
| latitude | DOUBLE PRECISION | NOT NULL | Point latitude |
| longitude | DOUBLE PRECISION | NOT NULL | Point longitude |
| point_type | VARCHAR(20) | NOT NULL | "original" or "intermediate" |
| segment_index | INT | NOT NULL | Which segment (between original points) |
| distance_from_previous | DOUBLE PRECISION | | Meters from previous point |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

**Indexes**:

- `idx_route_points_route` (route_id, sequence_number)
- `idx_route_points_coords` (latitude, longitude)

### route_regions

Junction table linking routes to their generated region requests, with geofence metadata.

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| route_id | UUID | FK → routes.id, CASCADE, PK | |
| region_id | UUID | FK → regions.id, CASCADE, PK | |
| is_geofence | BOOLEAN | NOT NULL, DEFAULT false | Whether point is inside a geofence |
| geofence_polygon_index | INTEGER | | Which polygon (0-based) the point is in |
| created_at | TIMESTAMP | NOT NULL, DEFAULT NOW | |

**Indexes**:

- `idx_route_regions_route` (route_id)
- `idx_route_regions_region` (region_id)

## Migration Strategy

- **Tool**: DbUp (embedded SQL scripts)
- **Execution**: Automatic on application startup (`DatabaseMigrator.Migrate()`)
- **Naming**: `NNN_DescriptiveName.sql` (sequential numbering)
- **Storage**: Embedded resources in `SatelliteProvider.DataAccess` assembly
- **Tracking**: DbUp's internal `schemaversions` table records which scripts have run
- **Rollback**: Not supported — forward-only migrations

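
The naming and tracking conventions above imply a simple forward-only selection: scripts are ordered by their numeric prefix, and only those not yet recorded are run. The sketch below models that idea in Python. It is not DbUp's implementation; the regular expression, the function name, and the use of bare file names in the journal are assumptions made for illustration.

```python
import re
from typing import Iterable, List, Set

# Assumed reading of the NNN_DescriptiveName.sql convention.
MIGRATION_NAME = re.compile(r"^(\d{3})_([A-Za-z0-9]+)\.sql$")


def pending_migrations(available: Iterable[str], already_run: Set[str]) -> List[str]:
    """Return scripts that have not run yet, ordered by numeric prefix."""
    parsed = []
    for name in available:
        match = MIGRATION_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected migration name: {name}")
        parsed.append((int(match.group(1)), name))
    # Forward-only: nothing is ever removed from the journal or re-run.
    return [name for _, name in sorted(parsed) if name not in already_run]


if __name__ == "__main__":
    scripts = ["015_AddTilesLeafletPathIndex.sql",
               "013_AddTileSourceAndCapturedAt.sql",
               "014_AddTileIdentityColumns.sql"]
    journal = {"013_AddTileSourceAndCapturedAt.sql"}
    print(pending_migrations(scripts, journal))
```
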
## Migration History

| # | Migration | Purpose |
|---|-----------|---------|
| 001 | CreateTilesTable | Base tiles table |
| 002 | CreateRegionsTable | Region request tracking |
| 003 | CreateIndexes | Performance indexes |
| 004 | AddVersionColumn | Year-based tile versioning + dedup |
| 005 | CreateRoutesTables | Routes, route_points, route_regions |
| 006 | AddStitchTilesToRegions | Stitch flag on regions |
| 007 | AddRouteMapFields | request_maps, maps_ready, file paths on routes |
| 008 | AddGeofenceFlagToRouteRegions | is_geofence flag |
| 009 | AddGeofencePolygonIndex | Polygon index tracking |
| 010 | AddTilesZipToRoutes | ZIP generation fields |
| 011 | AddTileCoordinates | Slippy map X/Y + rename zoom_level → tile_zoom |
| 012 | DropTileVersionConstraint | Drops legacy 5-col `(…, version)` unique index; replaces with 4-col `idx_tiles_unique_location` (preparation for AZ-484) |
| 013 | AddTileSourceAndCapturedAt | AZ-484: adds `source` (default `'google_maps'`) + `captured_at` columns; backfills both for pre-existing rows; replaces 4-col unique with 5-col `idx_tiles_unique_location_source`. Transactional; idempotent against partial replays |
| 014 | AddTileIdentityColumns | AZ-503: adds `flight_id` (NULL), `location_hash` (NOT NULL after backfill), `content_sha256` (NULL), `legacy_id` (NULL); backfills `location_hash` via `pg_temp.uuidv5(TILE_NAMESPACE, "{tile_zoom}/{tile_x}/{tile_y}")` and copies `id → legacy_id` for every pre-existing row; drops `idx_tiles_unique_location_source` (AZ-484) and creates `idx_tiles_unique_identity` (integer + flight-aware) + `idx_tiles_location_hash`. Enables `pgcrypto` for the in-migration SHA-1 digest. Transactional; safe to replay (column adds are `IF NOT EXISTS`-equivalent, backfill is idempotent on `location_hash` because UUIDv5 is deterministic) |
| 015 | AddTilesLeafletPathIndex | AZ-505: creates `tiles_leaflet_path (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source)` covering index for the leaflet hot path; drops the superseded `idx_tiles_location_hash` from migration 014 (equality lookups by `location_hash` now use the leading column of the covering index). Transactional; runs inside DbUp's per-script transaction (incompatible with `CREATE INDEX CONCURRENTLY`) — schedule deploys to a low-traffic window on populated tables. INCLUDE columns intentionally narrow (`file_path, source`); inventory queries that need more columns trigger a bounded heap fetch (per AZ-505 NFR-Perf-2 budget). |