# C6 Postgres Schema: Tiles Table + Sector Boundaries + Migration Script

**Task**: AZ-304_c6_postgres_schema
**Name**: C6 Postgres Schema
**Description**: Author the canonical Postgres schema for `c6_tile_cache`: `tiles` (composite key + spatial btree + LRU + voting state + onboard-ingest provenance + per-row JPEG disk size + content-hash chain), `sector_boundaries` (operator-set classification rectangles), `tile_freshness_rules` (per-flight thresholds the freshness gate reads). Ship the initial Alembic migration `_alembic/versions/0001_initial.py` (forward + reversible down), the schema dataclass mappings used by `PostgresFilesystemStore`, and the per-flight bootstrap migration runner that the composition root invokes at startup.
**Complexity**: 2 points
**Dependencies**: AZ-303_c6_storage_interfaces, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
**Component**: c6_tile_cache (epic AZ-250 / E-C6)
**Tracker**: AZ-304
**Epic**: AZ-250 (E-C6)

### Document Dependencies

- `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md`: defines the `TileMetadata` / `Bbox` / `SectorBoundary` shapes the schema must persist; defines the LRU + disk-budget contract.
- `_docs/02_document/contracts/c6_tile_cache/tile_store.md`: defines the `content_sha256_hex` invariant the `tiles.content_sha256` column carries.
- `_docs/02_document/contracts/shared_config/composition_root_protocol.md`: `config.tile_cache.postgres_dsn` field.
- `_docs/02_document/contracts/shared_logging/log_record_schema.md`: INFO log shape on migration apply / no-op.
- `_docs/02_document/data_model.md`: system-wide data model the schema must align with (`tiles`, `flight_id` provenance, `quality_metadata` JSONB shape).

## Problem

Without a frozen Postgres schema:

- `PostgresFilesystemStore` has nothing to insert against: `insert_metadata` cannot land any row.
- `query_by_bbox` has no btree to index against: even a 1k-row corpus will table-scan, blowing the C6-PT-01 latency budget.
- The composite-key uniqueness invariant from `tile_metadata_store.md` § I-1 is unenforced: duplicate-key inserts would silently corrupt the cache.
- `lru_candidates` cannot order by `accessed_at` without a column; `total_disk_bytes` cannot SUM without a `disk_bytes` column.
- The freshness gate (separate task) cannot read sector boundaries without a `sector_boundaries` table.
- The C11 `TileUploader` cannot drive its loop off `pending_uploads()` without an `uploaded_at` column.
- Re-running the companion against a stale DB has no migration runner: the operator would have to manually rebuild.

This task delivers the on-disk shape that every other C6 task and every consumer depends on. It writes no Python logic beyond the Alembic env + the schema-validation helper; the concrete `PostgresFilesystemStore` is a separate task.

## Outcome

- A migration script at `src/gps_denied_onboard/components/c6_tile_cache/_alembic/versions/0001_initial.py` (Alembic Python migration; the project's existing Alembic env is bootstrap-task-owned per AZ-263). Forward migration `upgrade()` creates three tables and four indexes; reverse `downgrade()` drops them in reverse order. The migration is idempotent against a clean DB and is rejected (Alembic's standard behaviour) if applied to a DB at a later revision.
- A migration runner `apply_migrations(config) -> MigrationResult` at `src/gps_denied_onboard/components/c6_tile_cache/migrations.py` invoked by the composition root at startup AFTER config load and BEFORE `PostgresFilesystemStore` construction. Returns `MigrationResult(applied: list[str], current_revision: str, no_op: bool)`. Logs INFO on every applied revision; logs INFO with `no_op=True` when the DB is already at head.
- Three tables exist after `upgrade()`:
  1. `tiles` (see Schema below).
  2. `sector_boundaries` (see Schema below).
  3. `tile_freshness_rules` (see Schema below).
- Four indexes exist after `upgrade()`:
  - `tiles_pkey`: `PRIMARY KEY (zoom_level, lat, lon, source)` (composite, enforces I-1 from the metadata-store contract).
  - `idx_tiles_spatial`: btree over `(zoom_level, lat, lon)` for `query_by_bbox`.
  - `idx_tiles_pending_upload`: partial btree over `(uploaded_at) WHERE source = 'onboard_ingest' AND uploaded_at IS NULL` for `pending_uploads`.
  - `idx_tiles_lru`: btree over `accessed_at` for `lru_candidates`.
- `quality_metadata` is JSONB (NOT a separate table), matching description.md § 2 and `data_model.md`. The JSONB shape is validated at the application layer (the `TileQualityMetadata` dataclass).
- A schema fixture `tests/fixtures/c6_postgres_schema_v1.sql` is the human-readable expected DDL used by the schema-shape test (AC-3).

## Scope

### Included

- The Alembic migration `0001_initial.py` covering three tables + four indexes.
- A `MigrationResult` dataclass `@dataclass(frozen=True)`.
- The `apply_migrations(config)` runner using the project-pinned Alembic version (already in the bootstrap dependency set per AZ-263).
- The schema-shape test (`tests/unit/c6_tile_cache/test_postgres_schema.py`) that introspects a freshly-migrated test DB and asserts the documented column types, nullable flags, default values, primary keys, and indexes (Postgres `information_schema` queries; no FAISS / no Python logic).
- The `_alembic/env.py` bootstrap (registers the migration directory with the existing project Alembic env; no NEW alembic config).
- The schema fixture `tests/fixtures/c6_postgres_schema_v1.sql`: copy-pastable DDL the test diffs against.
- Postgres connection helper `c6_tile_cache.connection.psycopg_pool(config) -> psycopg_pool.ConnectionPool` (used by both this task's runner and the future `PostgresFilesystemStore`); the helper is a thin wrapper over `psycopg_pool.ConnectionPool` that takes the DSN from config.
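
As an illustration of the `MigrationResult` shape listed above, the following is a minimal sketch rather than the task's actual implementation. The field names come from the Outcome section; `summarize_migration` is a hypothetical helper introduced only for this example, standing in for whatever the real `apply_migrations` runner does after Alembic reports which revisions it applied.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationResult:
    """Outcome of one apply_migrations call (fields per the Outcome section)."""

    applied: list[str]
    current_revision: str
    no_op: bool


def summarize_migration(applied_revisions: list[str], head: str) -> MigrationResult:
    """Hypothetical helper: derive no_op from whether anything was applied."""
    applied = list(applied_revisions)  # defensive copy of the caller's list
    return MigrationResult(
        applied=applied,
        current_revision=head,
        no_op=not applied,
    )


# First startup against a clean DB, then a second startup at head.
first = summarize_migration(["0001_initial"], "0001_initial")
second = summarize_migration([], "0001_initial")
print(first)
print(second)
```

Deriving `no_op` from the applied list keeps the two fields from disagreeing, which is what AC-1 and AC-2 check from opposite directions.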

### Excluded

- Concrete `PostgresFilesystemStore` (insert / query / mark methods): separate task (`c6_postgres_filesystem_store`).
- The freshness gate logic that reads `sector_boundaries` / `tile_freshness_rules`: separate task (`c6_freshness_gate`).
- The LRU eviction policy that reads `accessed_at`: separate task (`c6_cache_budget_eviction`).
- FAISS index file format: separate task (`c6_faiss_descriptor_index`).
- Sector-boundary CRUD (operator-side INSERT/UPDATE): owned by C12.
- Per-flight DB lifecycle (drop-and-rebuild between flights, freshness-rules reload): owned by the composition root's startup orchestration; this task only applies migrations idempotently.
- A second migration revision: every future schema change is a NEW migration file; this task only ships `0001_initial.py`.
- Postgres tuning (work_mem, shared_buffers): handled by the deployment / Dockerfile (E-DEPLOY); the schema is portable across reasonable Postgres 16 configurations.
- Postgres-version migration (16 → 17): out of scope this cycle; the schema MUST work on 16.x.

## Schema

### Table: `tiles`

| Column | Type | Nullable | Default | Notes |
|--------|------|----------|---------|-------|
| `zoom_level` | `INTEGER` | NO | none | composite PK |
| `lat` | `DOUBLE PRECISION` | NO | none | composite PK; centre latitude |
| `lon` | `DOUBLE PRECISION` | NO | none | composite PK; centre longitude |
| `source` | `TEXT` | NO | none | composite PK; CHECK `source IN ('googlemaps', 'onboard_ingest')` |
| `tile_size_meters` | `DOUBLE PRECISION` | NO | none | |
| `tile_size_pixels` | `INTEGER` | NO | none | |
| `capture_timestamp` | `TIMESTAMPTZ` | NO | none | UTC |
| `content_sha256` | `TEXT` | NO | none | 64 hex chars; matches the JPEG body hash from AZ-280's atomic-write/sidecar pattern |
| `freshness_label` | `TEXT` | NO | `'fresh'` | CHECK `freshness_label IN ('fresh', 'stale_active_conflict', 'stale_rear', 'downgraded')` |
| `flight_id` | `UUID` | YES | NULL | non-NULL when `source = 'onboard_ingest'` (CHECK enforces) |
| `companion_id` | `TEXT` | YES | NULL | non-NULL when `source = 'onboard_ingest'` (CHECK enforces) |
| `quality_metadata` | `JSONB` | YES | NULL | non-NULL when `source = 'onboard_ingest'` (CHECK enforces); shape validated app-side |
| `voting_status` | `TEXT` | NO | `'trusted'` for googlemaps; `'pending'` for onboard_ingest | CHECK `voting_status IN ('pending', 'trusted', 'rejected')`; default per-source via trigger |
| `disk_bytes` | `BIGINT` | NO | none | byte size of the on-disk JPEG; populated by `write_tile` |
| `accessed_at` | `TIMESTAMPTZ` | NO | `now()` | LRU clock; updated by `record_lru_access` |
| `uploaded_at` | `TIMESTAMPTZ` | YES | NULL | set by `mark_uploaded`; remains NULL until C11 `TileUploader` confirms post-flight upload |
| `created_at` | `TIMESTAMPTZ` | NO | `now()` | row-create timestamp; immutable |

Constraints:

- `PRIMARY KEY (zoom_level, lat, lon, source)`
- `CHECK (zoom_level BETWEEN 0 AND 21)`
- `CHECK (source IN ('googlemaps', 'onboard_ingest'))`
- `CHECK (freshness_label IN ('fresh', 'stale_active_conflict', 'stale_rear', 'downgraded'))`
- `CHECK (voting_status IN ('pending', 'trusted', 'rejected'))`
- `CHECK (disk_bytes >= 0)`
- `CHECK (length(content_sha256) = 64)`
- `CHECK ((source = 'onboard_ingest' AND flight_id IS NOT NULL AND companion_id IS NOT NULL AND quality_metadata IS NOT NULL) OR (source = 'googlemaps'))`

### Table: `sector_boundaries`

| Column | Type | Nullable | Default | Notes |
|--------|------|----------|---------|-------|
| `boundary_id` | `UUID` | NO | `gen_random_uuid()` | PK |
| `min_lat` | `DOUBLE PRECISION` | NO | none | |
| `min_lon` | `DOUBLE PRECISION` | NO | none | |
| `max_lat` | `DOUBLE PRECISION` | NO | none | |
| `max_lon` | `DOUBLE PRECISION` | NO | none | |
| `classification` | `TEXT` | NO | none | CHECK `classification IN ('active_conflict', 'stable_rear')` |
| `set_by_operator` | `TEXT` | NO | none | operator handle for audit |
| `set_at` | `TIMESTAMPTZ` | NO | `now()` | |

Constraints:

- `PRIMARY KEY (boundary_id)`
- `CHECK (min_lat <= max_lat AND min_lon <= max_lon)`
- `CHECK (classification IN ('active_conflict', 'stable_rear'))`

NO spatial index this cycle: the row count is small (at most a few hundred per flight), and the freshness gate reads them all into memory at flight start.

### Table: `tile_freshness_rules`

| Column | Type | Nullable | Default | Notes |
|--------|------|----------|---------|-------|
| `classification` | `TEXT` | NO | none | PK; matches `sector_boundaries.classification` |
| `max_age_seconds` | `BIGINT` | NO | none | seconds; for `stable_rear` this is the downgrade threshold; for `active_conflict` it is the rejection threshold |
| `action` | `TEXT` | NO | none | CHECK `action IN ('reject', 'downgrade')` |
| `set_at` | `TIMESTAMPTZ` | NO | `now()` | |

Constraints:

- `PRIMARY KEY (classification)`
- `CHECK (action IN ('reject', 'downgrade'))`
- `CHECK (max_age_seconds > 0)`

Default rows seeded by the migration:

- `('active_conflict', 6 * 30 * 86400, 'reject')`: 6 months (15552000 s), AC-8.2.
- `('stable_rear', 12 * 30 * 86400, 'downgrade')`: 12 months (31104000 s), AC-8.2.

## Acceptance Criteria

**AC-1: Migration is idempotent against a clean DB**
Given a fresh Postgres 16 database with no `alembic_version` row
When `apply_migrations(config)` runs
Then all three tables and all four indexes exist; the `alembic_version` row carries `0001_initial`; `MigrationResult.applied == ['0001_initial']`; `MigrationResult.no_op == False`

**AC-2: Migration is no-op when at head**
Given a Postgres DB already at `0001_initial`
When `apply_migrations(config)` runs again
Then `MigrationResult.applied == []`; `MigrationResult.no_op == True`; no DDL is emitted (verifiable via `pg_stat_user_tables` row counts unchanged)

**AC-3: Schema shape matches the documented DDL**
Given a freshly-migrated DB
When the schema-shape test introspects `information_schema.columns` and `pg_indexes`
Then every column matches the `Schema` section above (name, data type, nullability, default expression); every index matches (name, columns, partial-index predicate where applicable); every CHECK constraint exists with the documented expression

**AC-4: Composite primary key enforces uniqueness**
Given an empty `tiles` table
When two INSERTs with the same `(zoom_level, lat, lon, source)` are attempted with different `content_sha256` values
Then the second INSERT raises a Postgres unique-constraint violation; the first row is unaffected; the application layer translates this to `TileMetadataError` (in the `PostgresFilesystemStore` task; this task surfaces only the raw Postgres error)

**AC-5: CHECK constraint enforces source-aware mandatory fields**
Given an `onboard_ingest` row with `flight_id = NULL`
When the INSERT is attempted
Then the row is rejected by the CHECK constraint at the DB layer

**AC-6: Down migration reverses cleanly**
Given a DB at `0001_initial`
When `alembic downgrade -1` runs (operator-only command; not exercised by the runtime)
Then all three tables and all four indexes are dropped; the DB returns to the empty pre-migration state; subsequent `upgrade` re-applies cleanly

**AC-7: Default freshness rules are seeded**
Given a freshly-migrated DB
When the schema-shape test queries `tile_freshness_rules`
Then exactly two rows exist: `('active_conflict', 15552000, 'reject')` and `('stable_rear', 31104000, 'downgrade')`

**AC-8: Migration runner logs INFO on apply and no-op**
Given a clean DB
When `apply_migrations` runs and then runs again
Then the first call emits an INFO log with `kind="c6.migration.applied"` carrying `revisions=['0001_initial']`; the second call emits an INFO log with `kind="c6.migration.no_op"`

**AC-9: Quality metadata JSONB is validated app-side, NOT DB-side**
Given an `onboard_ingest` row with `quality_metadata = '{}'::jsonb` (empty JSONB but non-NULL)
When the INSERT runs at the DB layer
Then the INSERT succeeds (DB CHECK does not validate the JSONB shape); the application-layer validation (in `PostgresFilesystemStore`'s `insert_metadata`) is what would reject it. This task documents the boundary: the schema enforces presence/non-NULL only; shape is the impl task's responsibility.

## Non-Functional Requirements

**Performance**

- Migration apply ≤ 5 s on an empty Postgres 16 database. Schema is small (3 tables, 4 indexes) and the runner uses a single connection.
- `apply_migrations` no-op call (DB at head) ≤ 100 ms.
- Idempotency: the cost of re-running `apply_migrations` is bounded only by the head-detection query (a single SELECT against `alembic_version`).

**Compatibility**

- Postgres 16.x (matches `satellite-provider`'s pin per description.md § 5).
- `psycopg_pool` 3.x, already pinned by AZ-263 bootstrap.
- Alembic 1.13+, already pinned by AZ-263 bootstrap.

**Reliability**

- The migration is wrapped in a single transaction (Postgres supports transactional DDL, which Alembic uses by default). A crash mid-migration leaves the DB at the prior revision.
- The runner catches `psycopg.errors.SerializationFailure` and retries once with exponential backoff; after the second failure, raises a `MigrationError` (NEW error type defined here, NOT in `TileCacheError`; migrations are bootstrap-time, not runtime).

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | `apply_migrations` against fresh testcontainer DB | Three tables + four indexes exist; alembic_version='0001_initial'; result.applied=['0001_initial'] |
| AC-2 | `apply_migrations` against already-migrated DB | result.applied=[]; result.no_op=True; no DDL emitted |
| AC-3 | Introspect information_schema after migration; diff against `tests/fixtures/c6_postgres_schema_v1.sql` | Zero diff; every column / index / CHECK matches |
| AC-4 | Two INSERTs with same `(zoom, lat, lon, source)` | Second INSERT raises `psycopg.errors.UniqueViolation` |
| AC-5 | INSERT `onboard_ingest` row with `flight_id=NULL` | Raises `psycopg.errors.CheckViolation` |
| AC-6 | `alembic downgrade -1` then `upgrade` | DB returns to empty state then re-applies cleanly |
| AC-7 | SELECT `tile_freshness_rules` after migration | Exactly 2 rows with documented values |
| AC-8 | Capture log records during migration apply + no-op | Two INFO records with `kind="c6.migration.applied"` and `kind="c6.migration.no_op"` |
| AC-9 | INSERT row with `quality_metadata='{}'::jsonb` | DB-layer accepts; documented as app-side responsibility |
| NFR-perf-apply | Migration apply on empty 16.x | Wall ≤ 5 s |
| NFR-perf-noop | `apply_migrations` no-op timing | Wall ≤ 100 ms |
| NFR-reliability-retry | Inject `SerializationFailure` once, then succeed | Migration succeeds on retry; on second failure raises `MigrationError` |

## Constraints

- Postgres 16.x ONLY this cycle; no SQLite / no MySQL fallback.
- Alembic + `psycopg_pool` are already pinned by AZ-263; this task does NOT introduce new third-party dependencies.
- The migration MUST be reversible (`downgrade` drops cleanly); operator post-flight tooling depends on it for "drop-and-rebuild" flows.
- The schema MUST mirror `data_model.md` exactly (especially the `quality_metadata` JSONB shape and the `voting_status` enum). Any deviation requires a `data_model.md` update first; this task does NOT silently extend the data model.
- The `quality_metadata` JSONB shape is NOT validated at the DB layer (no domain types, no CHECK on JSON structure). That validation is `PostgresFilesystemStore.insert_metadata` (separate task), as documented in AC-9.
- `gen_random_uuid()` is provided by Postgres core since version 13 (older versions need the `pgcrypto` extension); the migration's `upgrade()` runs `CREATE EXTENSION IF NOT EXISTS pgcrypto` as its first statement.
- `MigrationError` is NOT a member of the `TileCacheError` family; migrations run before any `c6_tile_cache.errors` consumer is constructed.
- The schema-fixture file `tests/fixtures/c6_postgres_schema_v1.sql` is the diff target; updating it without a migration revision is a Spec-Gap finding (High) at code-review time.

## Risks & Mitigation

**Risk 1: `quality_metadata` JSONB silently malformed**

- *Risk*: An impl task writes a `quality_metadata` JSONB that doesn't match the `TileQualityMetadata` shape; the DB accepts it; downstream consumers crash on read.
- *Mitigation*: AC-9 documents the boundary: the DB only enforces presence; shape is `insert_metadata`'s job. The future `c6_postgres_filesystem_store` task's tests cover round-trip of every documented shape.

**Risk 2: Alembic version drift between dev and CI**

- *Risk*: A developer pins a different Alembic minor version and migrations apply differently in CI.
- *Mitigation*: AZ-263 bootstrap pins Alembic to a single minor; this task adds no version constraints of its own.

**Risk 3: Down-migration data loss is irreversible**

- *Risk*: Operator runs `alembic downgrade -1` on a DB with live data; tiles are lost.
- *Mitigation*: Down-migration is documented as operator-only and destructive; the runner does NOT auto-downgrade. The composition root's startup runner only ever calls `upgrade head`.

**Risk 4: Spatial-index strategy is wrong for high-zoom queries**

- *Risk*: `(zoom_level, lat, lon)` btree may not be optimal for a tight bbox at zoom 21.
- *Mitigation*: AC-3 fixes the index shape; if `query_by_bbox` benchmarks fail at takeoff load, a follow-up migration adds a GiST index. Not blocking this cycle (description.md notes the row count is bounded; btree is sufficient).

**Risk 5: `pgcrypto` extension not available on a deployment**

- *Risk*: A Tier-1 Postgres deployment ships without the `pgcrypto` extension package; the migration's `CREATE EXTENSION IF NOT EXISTS pgcrypto` statement fails (on Postgres 13+ `gen_random_uuid()` itself is available from core).
- *Mitigation*: `CREATE EXTENSION IF NOT EXISTS pgcrypto` is the migration's first statement, so if the deployment lacks the extension package, `apply_migrations` raises `MigrationError` early, surfaced to the operator at composition.

## Runtime Completeness

- **Named capability**: Postgres 16 spatial metadata index + per-flight schema bootstrap + LRU/upload bookkeeping columns + sector-boundary classification table + per-classification freshness rules table (description.md / data_model.md / AC-NEW-3 / AC-NEW-6 / RESTRICT-SAT-2).
- **Production code that must exist**: real Alembic migration `0001_initial.py`, real `apply_migrations` runner, real schema-fixture diff test, real `psycopg_pool` connection helper.
- **Allowed external stubs**: tests use `testcontainers`-managed Postgres 16 instances (already in the project's test infra per AZ-263); production wiring uses the operator's deployed Postgres.
- **Unacceptable substitutes**: SQLite "for testing only" (`production` and `test` environments MUST both be Postgres 16; test environment as close to production as possible per coderule.mdc); raw SQL DDL applied without Alembic (would defeat the version-tracking the runner depends on); a `quality_metadata` validation at the DB layer (would lock the schema to the JSONB shape; the application-side validation is the single source of truth).

## Contract

This task does NOT produce a new contract file; it implements the `tile_metadata_store.md` contract's persistence surface. The schema-fixture file `tests/fixtures/c6_postgres_schema_v1.sql` is the diff target referenced in `tile_metadata_store.md` § Test Cases (`schema-shape-fixture-diff`), but the contract document of record stays the Protocol contract.
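
For reference, the single-retry behaviour described under Reliability (and exercised by the NFR-reliability-retry test) could look roughly like the sketch below. It is not the runner's actual code: `run_with_single_retry` and `TransientFailure` are hypothetical names used only here, with `TransientFailure` standing in for `psycopg.errors.SerializationFailure` so the example runs without a database driver. The `backoff_seconds` default is likewise an assumption.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


class MigrationError(Exception):
    """Bootstrap-time failure; deliberately outside the TileCacheError family."""


class TransientFailure(Exception):
    """Stand-in (assumption) for psycopg.errors.SerializationFailure."""


def run_with_single_retry(
    step: Callable[[], T],
    *,
    retry_on: type[Exception] = TransientFailure,
    backoff_seconds: float = 0.05,  # assumed value, not from the spec
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run step; on one transient failure wait and retry, then give up."""
    try:
        return step()
    except retry_on:
        sleep(backoff_seconds)  # back off before the one allowed retry
    try:
        return step()
    except retry_on as exc:
        # Second failure: surface a MigrationError with the cause chained.
        raise MigrationError("migration failed after one retry") from exc
```

Injecting `sleep` keeps a test of the failure path from actually waiting; a test can pass `sleep=lambda _: None` and count how many times `step` was called.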