gps-denied-onboard/_docs/02_tasks/todo/AZ-304_c6_postgres_schema.md
Oleksandr Bezdieniezhnykh 880eabcb3f Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components
(C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446
plus the _dependencies_table.md and component contract documents.

State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:39:48 +03:00


C6 Postgres Schema — Tiles Table + Sector Boundaries + Migration Script

Task: AZ-304_c6_postgres_schema
Name: C6 Postgres Schema
Description: Author the canonical Postgres schema for c6_tile_cache: tiles (composite key + spatial btree + LRU + voting state + onboard-ingest provenance + per-row JPEG disk size + content-hash chain), sector_boundaries (operator-set classification rectangles), tile_freshness_rules (per-flight thresholds the freshness gate reads). Ship the initial Alembic migration _alembic/versions/0001_initial.py (forward + reversible down), the schema dataclass mappings used by PostgresFilesystemStore, and the per-flight bootstrap migration runner that the composition root invokes at startup.
Complexity: 2 points
Dependencies: AZ-303_c6_storage_interfaces, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
Component: c6_tile_cache (epic AZ-250 / E-C6)
Tracker: AZ-304
Epic: AZ-250 (E-C6)

Document Dependencies

  • _docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md — defines the TileMetadata / Bbox / SectorBoundary shapes the schema must persist; defines the LRU + disk-budget contract.
  • _docs/02_document/contracts/c6_tile_cache/tile_store.md — defines the content_sha256_hex invariant the tiles.content_sha256 column carries.
  • _docs/02_document/contracts/shared_config/composition_root_protocol.md: the config.tile_cache.postgres_dsn field.
  • _docs/02_document/contracts/shared_logging/log_record_schema.md — INFO log shape on migration apply / no-op.
  • _docs/02_document/data_model.md — system-wide data model the schema must align with (tiles, flight_id provenance, quality_metadata JSONB shape).

Problem

Without a frozen Postgres schema:

  • PostgresFilesystemStore has nothing to insert against — insert_metadata cannot land any row.
  • query_by_bbox has no btree to index against — even a 1k-row corpus will table-scan, blowing the C6-PT-01 latency budget.
  • The composite-key uniqueness invariant from tile_metadata_store.md § I-1 is unenforced — duplicate-key inserts would silently corrupt the cache.
  • lru_candidates cannot order by accessed_at without a column; total_disk_bytes cannot SUM without a disk_bytes column.
  • The freshness gate (separate task) cannot read sector boundaries without a sector_boundaries table.
  • The C11 TileUploader cannot drive its loop off pending_uploads() without an uploaded_at column.
  • Re-running the companion against a stale DB has no migration runner — the operator would have to manually rebuild.

This task delivers the on-disk shape that every other C6 task and every consumer depends on. It writes no Python logic beyond the Alembic env + the schema-validation helper — concrete PostgresFilesystemStore is a separate task.

Outcome

  • A migration script at src/gps_denied_onboard/components/c6_tile_cache/_alembic/versions/0001_initial.py (Alembic Python migration; the project's existing Alembic env is bootstrap-task-owned per AZ-263). Forward migration upgrade() creates three tables and four indexes; reverse downgrade() drops them in reverse order. The migration is idempotent against a clean DB and is rejected (Alembic's standard behaviour) if applied to a DB at a later revision.
  • A migration runner apply_migrations(config) -> MigrationResult at src/gps_denied_onboard/components/c6_tile_cache/migrations.py invoked by the composition root at startup AFTER config load and BEFORE PostgresFilesystemStore construction. Returns MigrationResult(applied: list[str], current_revision: str, no_op: bool). Logs INFO on every applied revision; logs INFO with no_op=True when the DB is already at head.
  • Three tables exist after upgrade():
    1. tiles — see Schema below.
    2. sector_boundaries — see Schema below.
    3. tile_freshness_rules — see Schema below.
  • Four indexes exist after upgrade():
    • tiles_pkey: PRIMARY KEY (zoom_level, lat, lon, source) (composite, enforces I-1 from the metadata-store contract).
    • idx_tiles_spatial — btree over (zoom_level, lat, lon) for query_by_bbox.
    • idx_tiles_pending_upload — partial btree over (uploaded_at) WHERE source = 'onboard_ingest' AND uploaded_at IS NULL for pending_uploads.
    • idx_tiles_lru — btree over accessed_at for lru_candidates.
  • quality_metadata is JSONB (NOT a separate table) — matches description.md § 2 and data_model.md. The JSONB shape is validated at the application layer (the TileQualityMetadata dataclass).
  • A schema fixture tests/fixtures/c6_postgres_schema_v1.sql is the human-readable expected DDL used by the schema-shape test (AC-3).
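The indexes listed in the Outcome could be written as the DDL strings below. This is a minimal sketch under stated assumptions: the names TILES_PK_CLAUSE, INDEX_DDL and drop_statements are hypothetical, and the real 0001_initial.py would be expected to issue equivalent statements through Alembic rather than from a bare dictionary. Postgres builds btree indexes by default, so no USING clause is needed.

```python
# Hypothetical sketch of the DDL that upgrade() might emit for the tiles indexes.
# The primary-key index (tiles_pkey) is created implicitly by the table's
# PRIMARY KEY constraint, so only three explicit CREATE INDEX statements remain.
TILES_PK_CLAUSE = "CONSTRAINT tiles_pkey PRIMARY KEY (zoom_level, lat, lon, source)"

INDEX_DDL = {
    "idx_tiles_spatial": (
        "CREATE INDEX idx_tiles_spatial ON tiles (zoom_level, lat, lon)"
    ),
    "idx_tiles_pending_upload": (
        "CREATE INDEX idx_tiles_pending_upload ON tiles (uploaded_at) "
        "WHERE source = 'onboard_ingest' AND uploaded_at IS NULL"
    ),
    "idx_tiles_lru": "CREATE INDEX idx_tiles_lru ON tiles (accessed_at)",
}


def drop_statements(index_ddl):
    """Return DROP INDEX statements in reverse creation order, as downgrade() would."""
    return [f"DROP INDEX {name}" for name in reversed(list(index_ddl))]
```

The partial predicate on idx_tiles_pending_upload keeps that index limited to onboard-ingest rows still awaiting upload, which is the only set pending_uploads needs to scan.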

Scope

Included

  • The Alembic migration 0001_initial.py covering three tables + four indexes.
  • A MigrationResult dataclass @dataclass(frozen=True).
  • The apply_migrations(config) runner using the project-pinned Alembic version (already in the bootstrap dependency set per AZ-263).
  • The schema-shape test (tests/unit/c6_tile_cache/test_postgres_schema.py) that introspects a freshly-migrated test DB and asserts the documented column types, nullable flags, default values, primary keys, and indexes (Postgres information_schema queries; no FAISS / no Python logic).
  • The _alembic/env.py bootstrap (registers the migration directory with the existing project Alembic env; no NEW alembic config).
  • The schema fixture tests/fixtures/c6_postgres_schema_v1.sql — copy-pastable DDL the test diffs against.
  • Postgres connection helper c6_tile_cache.connection.psycopg_pool(config) -> psycopg_pool.ConnectionPool (used by both this task's runner and the future PostgresFilesystemStore); the helper is a thin wrapper over psycopg_pool.ConnectionPool that takes the DSN from config.
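The MigrationResult shape named above can be sketched as a frozen dataclass. The field names come from the Outcome section; the build_result helper is a hypothetical illustration of how no_op could be derived from the list of applied revisions, not the runner itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationResult:
    applied: list[str]      # revisions applied by this call, in order
    current_revision: str   # revision the DB is at after the call
    no_op: bool             # True when the DB was already at head


def build_result(applied, current_revision):
    """Hypothetical helper: a call is a no-op exactly when nothing was applied."""
    return MigrationResult(
        applied=list(applied),
        current_revision=current_revision,
        no_op=not applied,
    )
```

Freezing the dataclass means the composition root can log or pass the result around without any consumer mutating it.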

Excluded

  • Concrete PostgresFilesystemStore (insert / query / mark methods) — separate task (c6_postgres_filesystem_store).
  • The freshness gate logic that reads sector_boundaries / tile_freshness_rules — separate task (c6_freshness_gate).
  • The LRU eviction policy that reads accessed_at — separate task (c6_cache_budget_eviction).
  • FAISS index file format — separate task (c6_faiss_descriptor_index).
  • Sector-boundary CRUD (operator-side INSERT/UPDATE) — owned by C12.
  • Per-flight DB lifecycle (drop-and-rebuild between flights, freshness-rules reload) — owned by the composition root's startup orchestration; this task only applies migrations idempotently.
  • A second migration revision — every future schema change is a NEW migration file; this task only ships 0001_initial.py.
  • Postgres tuning (work_mem, shared_buffers) — handled by the deployment / Dockerfile (E-DEPLOY); the schema is portable across reasonable Postgres 16 configurations.
  • Postgres-version migration (16 → 17) — out of scope this cycle; the schema MUST work on 16.x.

Schema

Table: tiles

| Column | Type | Nullable | Default | Notes |
| --- | --- | --- | --- | --- |
| zoom_level | INTEGER | NO | | composite PK |
| lat | DOUBLE PRECISION | NO | | composite PK; centre latitude |
| lon | DOUBLE PRECISION | NO | | composite PK; centre longitude |
| source | TEXT | NO | | composite PK; CHECK source IN ('googlemaps', 'onboard_ingest') |
| tile_size_meters | DOUBLE PRECISION | NO | | |
| tile_size_pixels | INTEGER | NO | | |
| capture_timestamp | TIMESTAMPTZ | NO | | UTC |
| content_sha256 | TEXT | NO | | 64 hex chars; matches the JPEG body hash from AZ-280's atomic-write/sidecar pattern |
| freshness_label | TEXT | NO | 'fresh' | CHECK freshness_label IN ('fresh', 'stale_active_conflict', 'stale_rear', 'downgraded') |
| flight_id | UUID | YES | NULL | non-NULL when source = 'onboard_ingest' (CHECK enforces) |
| companion_id | TEXT | YES | NULL | non-NULL when source = 'onboard_ingest' (CHECK enforces) |
| quality_metadata | JSONB | YES | NULL | non-NULL when source = 'onboard_ingest' (CHECK enforces); shape validated app-side |
| voting_status | TEXT | NO | 'trusted' for googlemaps; 'pending' for onboard_ingest | CHECK voting_status IN ('pending', 'trusted', 'rejected'); default per-source via trigger |
| disk_bytes | BIGINT | NO | | byte size of the on-disk JPEG; populated by write_tile |
| accessed_at | TIMESTAMPTZ | NO | now() | LRU clock; updated by record_lru_access |
| uploaded_at | TIMESTAMPTZ | YES | NULL | set by mark_uploaded; remains NULL until C11 TileUploader confirms post-flight upload |
| created_at | TIMESTAMPTZ | NO | now() | row-create timestamp; immutable |

Constraints:

  • PRIMARY KEY (zoom_level, lat, lon, source)
  • CHECK (zoom_level BETWEEN 0 AND 21)
  • CHECK (source IN ('googlemaps', 'onboard_ingest'))
  • CHECK (freshness_label IN ('fresh', 'stale_active_conflict', 'stale_rear', 'downgraded'))
  • CHECK (voting_status IN ('pending', 'trusted', 'rejected'))
  • CHECK (disk_bytes >= 0)
  • CHECK (length(content_sha256) = 64)
  • CHECK ((source = 'onboard_ingest' AND flight_id IS NOT NULL AND companion_id IS NOT NULL AND quality_metadata IS NOT NULL) OR (source = 'googlemaps'))
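The source-aware CHECK in the last bullet can be mirrored in plain Python to make its truth table explicit. This is only an illustrative sketch; the function name satisfies_source_check is hypothetical, and the constraint itself lives in the database.

```python
def satisfies_source_check(source, flight_id, companion_id, quality_metadata):
    """Mirror of the final tiles CHECK: onboard_ingest rows need all three provenance fields."""
    if source == "onboard_ingest":
        return (
            flight_id is not None
            and companion_id is not None
            and quality_metadata is not None
        )
    # googlemaps rows pass regardless of the provenance columns.
    return source == "googlemaps"
```

Note that an empty but non-NULL quality_metadata value passes this predicate, which matches the presence-only boundary the schema draws.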

Table: sector_boundaries

| Column | Type | Nullable | Default | Notes |
| --- | --- | --- | --- | --- |
| boundary_id | UUID | NO | gen_random_uuid() | PK |
| min_lat | DOUBLE PRECISION | NO | | |
| min_lon | DOUBLE PRECISION | NO | | |
| max_lat | DOUBLE PRECISION | NO | | |
| max_lon | DOUBLE PRECISION | NO | | |
| classification | TEXT | NO | | CHECK classification IN ('active_conflict', 'stable_rear') |
| set_by_operator | TEXT | NO | | operator handle for audit |
| set_at | TIMESTAMPTZ | NO | now() | |

Constraints:

  • PRIMARY KEY (boundary_id)
  • CHECK (min_lat <= max_lat AND min_lon <= max_lon)
  • CHECK (classification IN ('active_conflict', 'stable_rear'))

NO spatial index this cycle — the row count is small (≤ a few hundred per flight), and the freshness gate reads them all into memory at flight start.

Table: tile_freshness_rules

| Column | Type | Nullable | Default | Notes |
| --- | --- | --- | --- | --- |
| classification | TEXT | NO | | PK; matches sector_boundaries.classification |
| max_age_seconds | BIGINT | NO | | seconds; for stable_rear this is the downgrade threshold, for active_conflict the rejection threshold |
| action | TEXT | NO | | CHECK action IN ('reject', 'downgrade') |
| set_at | TIMESTAMPTZ | NO | now() | |

Constraints:

  • PRIMARY KEY (classification)
  • CHECK (action IN ('reject', 'downgrade'))
  • CHECK (max_age_seconds > 0)

Default rows seeded by the migration:

  • ('active_conflict', 6 * 30 * 86400, 'reject') — 6 months, AC-8.2.
  • ('stable_rear', 12 * 30 * 86400, 'downgrade') — 12 months, AC-8.2.
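The seeded thresholds follow from counting a month as 30 days. The arithmetic can be checked with a short sketch; the constant names are hypothetical, and the values agree with the literals quoted in AC-7.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # the spec's convention: a month is counted as 30 days

# (classification, max_age_seconds, action) rows seeded by the migration.
DEFAULT_FRESHNESS_RULES = [
    ("active_conflict", 6 * DAYS_PER_MONTH * SECONDS_PER_DAY, "reject"),
    ("stable_rear", 12 * DAYS_PER_MONTH * SECONDS_PER_DAY, "downgrade"),
]
```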

Acceptance Criteria

AC-1: Migration is idempotent against a clean DB
Given a fresh Postgres 16 database with no alembic_version row
When apply_migrations(config) runs
Then all three tables and all four indexes exist; the alembic_version row carries 0001_initial; MigrationResult.applied == ['0001_initial']; MigrationResult.no_op == False

AC-2: Migration is no-op when at head
Given a Postgres DB already at 0001_initial
When apply_migrations(config) runs again
Then MigrationResult.applied == []; MigrationResult.no_op == True; no DDL is emitted (verifiable via pg_stat_user_tables row counts unchanged)

AC-3: Schema shape matches the documented DDL
Given a freshly-migrated DB
When the schema-shape test introspects information_schema.columns and pg_indexes
Then every column matches the Schema section above (name, data type, nullability, default expression); every index matches (name, columns, partial-index predicate where applicable); every CHECK constraint exists with the documented expression
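The column comparison at the heart of the schema-shape test can be sketched as a pure function over two mappings. This is an assumption-laden illustration: diff_columns and the mapping layout {column: (data_type, is_nullable, column_default)} are hypothetical, chosen to resemble what a query against information_schema.columns returns.

```python
def diff_columns(expected, actual):
    """Return readable mismatches between two {column: (type, nullable, default)} maps."""
    problems = []
    for name in sorted(expected.keys() | actual.keys()):
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif name not in expected:
            problems.append(f"unexpected column: {name}")
        elif expected[name] != actual[name]:
            problems.append(
                f"mismatch on {name}: expected {expected[name]}, got {actual[name]}"
            )
    return problems
```

An empty list corresponds to the "zero diff" outcome; any entry names the offending column so the failure message points straight at the drift.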

AC-4: Composite primary key enforces uniqueness
Given an empty tiles table
When two INSERTs with the same (zoom_level, lat, lon, source) are attempted with different content_sha256 values
Then the second INSERT raises a Postgres unique-constraint violation; the first row is unaffected; the application layer translates this to TileMetadataError (in the PostgresFilesystemStore task; this task surfaces only the raw Postgres error)

AC-5: CHECK constraint enforces source-aware mandatory fields
Given an onboard_ingest row with flight_id = NULL
When the INSERT is attempted
Then the row is rejected by the CHECK constraint at the DB layer

AC-6: Down migration reverses cleanly
Given a DB at 0001_initial
When alembic downgrade -1 runs (operator-only command; not exercised by the runtime)
Then all three tables and all four indexes are dropped; the DB returns to the empty pre-migration state; subsequent upgrade re-applies cleanly

AC-7: Default freshness rules are seeded
Given a freshly-migrated DB
When the schema-shape test queries tile_freshness_rules
Then exactly two rows exist: ('active_conflict', 15552000, 'reject') and ('stable_rear', 31104000, 'downgrade')

AC-8: Migration runner logs INFO on apply and no-op
Given a clean DB
When apply_migrations runs and then runs again
Then the first call emits an INFO log with kind="c6.migration.applied" carrying revisions=['0001_initial']; the second call emits an INFO log with kind="c6.migration.no_op"
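One way the two log records in AC-8 could be emitted is sketched here with the standard logging module. The authoritative record shape is defined by log_record_schema.md; passing kind and revisions through the extra argument, the logger name, and the log_migration_outcome helper are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("c6_tile_cache.migrations")


def log_migration_outcome(applied):
    """Emit one INFO record whose `kind` distinguishes apply from no-op."""
    if applied:
        logger.info(
            "migrations applied",
            extra={"kind": "c6.migration.applied", "revisions": list(applied)},
        )
    else:
        logger.info(
            "schema already at head",
            extra={"kind": "c6.migration.no_op"},
        )
```

Keys passed via extra become attributes on the LogRecord, so a test can capture records with a custom handler and assert on record.kind directly.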

AC-9: Quality metadata JSONB is validated app-side, NOT DB-side
Given an onboard_ingest row with quality_metadata = '{}'::jsonb (empty JSONB but non-NULL)
When the INSERT runs at the DB layer
Then the INSERT succeeds (DB CHECK does not validate the JSONB shape); the application-layer validation (in PostgresFilesystemStore's insert_metadata) is what would reject it.
This task documents the boundary: the schema enforces presence/non-NULL only; shape is the impl task's responsibility.

Non-Functional Requirements

Performance

  • Migration apply ≤ 5 s on an empty Postgres 16 database. Schema is small (3 tables, 4 indexes) and the runner uses a single connection.
  • apply_migrations no-op call (DB at head) ≤ 100 ms.
  • Idempotency: re-running apply_migrations is bounded only by the head-detection query (a single SELECT against alembic_version).

Compatibility

  • Postgres 16.x (matches satellite-provider's pin per description.md § 5).
  • psycopg_pool 3.x — already pinned by AZ-263 bootstrap.
  • Alembic 1.13+ — already pinned by AZ-263 bootstrap.

Reliability

  • The migration is wrapped in a single transaction (Alembic's default for non-DDL-batched migrations on Postgres). A crash mid-migration leaves the DB at the prior revision.
  • The runner catches psycopg.errors.SerializationFailure and retries once after a backoff delay; if the retry also fails, it raises MigrationError (a NEW error type defined here, NOT part of the TileCacheError family, because migrations are bootstrap-time, not runtime).
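The retry rule can be sketched without a database. To stay runnable without psycopg, the sketch below uses a local stand-in for psycopg.errors.SerializationFailure; the helper name run_with_one_retry, the default delay, and the injectable sleep parameter are hypothetical.

```python
import time


class SerializationFailure(Exception):
    """Stand-in for psycopg.errors.SerializationFailure so the sketch runs without psycopg."""


class MigrationError(Exception):
    """Bootstrap-time error; deliberately outside the TileCacheError family."""


def run_with_one_retry(operation, backoff_seconds=0.5, sleep=time.sleep):
    """Run `operation`; on a serialization failure wait once and retry, then give up."""
    try:
        return operation()
    except SerializationFailure:
        sleep(backoff_seconds)
    try:
        return operation()
    except SerializationFailure as exc:
        raise MigrationError("migration failed after one retry") from exc
```

Injecting sleep keeps the NFR-reliability-retry test fast: the test can pass a no-op callable instead of waiting on a real clock.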

Unit Tests

| AC Ref | What to Test | Required Outcome |
| --- | --- | --- |
| AC-1 | apply_migrations against fresh testcontainer DB | Three tables + four indexes exist; alembic_version='0001_initial'; result.applied=['0001_initial'] |
| AC-2 | apply_migrations against already-migrated DB | result.applied=[]; result.no_op=True; no DDL emitted |
| AC-3 | Introspect information_schema after migration; diff against tests/fixtures/c6_postgres_schema_v1.sql | Zero diff; every column / index / CHECK matches |
| AC-4 | Two INSERTs with same (zoom, lat, lon, source) | Second INSERT raises psycopg.errors.UniqueViolation |
| AC-5 | INSERT onboard_ingest row with flight_id=NULL | Raises psycopg.errors.CheckViolation |
| AC-6 | alembic downgrade -1 then upgrade | DB returns to empty state then re-applies cleanly |
| AC-7 | SELECT tile_freshness_rules after migration | Exactly 2 rows with documented values |
| AC-8 | Capture log records during migration apply + no-op | Two INFO records with kind="c6.migration.applied" and kind="c6.migration.no_op" |
| AC-9 | INSERT row with quality_metadata='{}'::jsonb | DB-layer accepts; documented as app-side responsibility |
| NFR-perf-apply | Migration apply on empty 16.x | Wall ≤ 5 s |
| NFR-perf-noop | apply_migrations no-op timing | Wall ≤ 100 ms |
| NFR-reliability-retry | Inject SerializationFailure once, then succeed | Migration succeeds on retry; on second failure raises MigrationError |

Constraints

  • Postgres 16.x ONLY this cycle; no SQLite / no MySQL fallback.
  • Alembic + psycopg_pool are already pinned by AZ-263; this task does NOT introduce new third-party dependencies.
  • The migration MUST be reversible (downgrade drops cleanly) — operator post-flight tooling depends on it for "drop-and-rebuild" flows.
  • The schema MUST mirror data_model.md exactly (especially the quality_metadata JSONB shape and the voting_status enum). Any deviation requires a data_model.md update first; this task does NOT silently extend the data model.
  • The quality_metadata JSONB shape is NOT validated at the DB layer (no domain types, no CHECK on JSON structure). That validation is PostgresFilesystemStore.insert_metadata (separate task) — documented in AC-9.
  • gen_random_uuid() is built into core Postgres from version 13 onward, so on 16.x it does not itself depend on pgcrypto; the migration's upgrade() nevertheless runs CREATE EXTENSION IF NOT EXISTS pgcrypto as its first statement.
  • MigrationError is NOT a member of the TileCacheError family — migrations run before any c6_tile_cache.errors consumer is constructed.
  • The schema-fixture file tests/fixtures/c6_postgres_schema_v1.sql is the diff target; updating it without a migration revision is a Spec-Gap finding (High) at code-review time.

Risks & Mitigation

Risk 1: quality_metadata JSONB silently malformed

  • Risk: An impl task writes a quality_metadata JSONB that doesn't match TileQualityMetadata shape; the DB accepts it; downstream consumers crash on read.
  • Mitigation: AC-9 documents the boundary — DB only enforces presence; shape is insert_metadata's job. The future c6_postgres_filesystem_store task's tests cover round-trip of every documented shape.

Risk 2: Alembic version drift between dev and CI

  • Risk: A developer pins a different Alembic minor version, and migrations apply differently in CI than in development.
  • Mitigation: AZ-263 bootstrap pins Alembic to a single minor; this task adds no version constraints of its own.

Risk 3: Down-migration data loss is irreversible

  • Risk: Operator runs alembic downgrade -1 on a DB with live data; tiles are lost.
  • Mitigation: Down-migration is documented as operator-only and destructive; the runner does NOT auto-downgrade. The composition root's startup runner only ever calls upgrade head.

Risk 4: Spatial-index strategy is wrong for high-zoom queries

  • Risk: (zoom_level, lat, lon) btree may not be optimal for a tight bbox at zoom 21.
  • Mitigation: AC-3 fixes the index shape; if query_by_bbox benchmarks fail at takeoff load, a follow-up migration adds a GIST index. Not blocking this cycle (description.md notes the row count is bounded; btree is sufficient).
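To make the btree trade-off concrete, here is a hypothetical shape for the lookup that idx_tiles_spatial would serve. query_by_bbox belongs to a separate task, so the function name bbox_query, the selected columns, and the %s placeholders (psycopg's parameter style) are assumptions rather than the project's actual query.

```python
def bbox_query(zoom_level, min_lat, min_lon, max_lat, max_lon):
    """Build a parameterised bbox lookup shaped to use the (zoom_level, lat, lon) btree."""
    sql = (
        "SELECT zoom_level, lat, lon, source FROM tiles "
        "WHERE zoom_level = %s "
        "AND lat BETWEEN %s AND %s "
        "AND lon BETWEEN %s AND %s"
    )
    # Parameter order follows the placeholders: zoom, lat range, then lon range.
    return sql, (zoom_level, min_lat, max_lat, min_lon, max_lon)
```

With a multicolumn btree, the equality on zoom_level and the range on lat bound the portion of the index that is scanned, while the lon range is checked against index entries without narrowing the scan further. That is why a tight bbox inside a wide latitude band could favour a GIST index in a follow-up migration.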

Risk 5: pgcrypto extension not available on a deployment

  • Risk: A Tier-1 Postgres deployment ships without the pgcrypto extension package; the migration's CREATE EXTENSION IF NOT EXISTS pgcrypto statement fails (gen_random_uuid() itself is part of core Postgres since version 13).
  • Mitigation: Because CREATE EXTENSION IF NOT EXISTS pgcrypto is the migration's first statement, a deployment that lacks the extension package makes apply_migrations raise MigrationError early, surfaced to the operator at composition.

Runtime Completeness

  • Named capability: Postgres 16 spatial metadata index + per-flight schema bootstrap + LRU/upload bookkeeping columns + sector-boundary classification table + per-classification freshness rules table (description.md / data_model.md / AC-NEW-3 / AC-NEW-6 / RESTRICT-SAT-2).
  • Production code that must exist: real Alembic migration 0001_initial.py, real apply_migrations runner, real schema-fixture diff test, real psycopg_pool connection helper.
  • Allowed external stubs: tests use testcontainers-managed Postgres 16 instances (already in the project's test infra per AZ-263); production wiring uses the operator's deployed Postgres.
  • Unacceptable substitutes: SQLite "for testing only" — production and test environments MUST both be Postgres 16 (test environment as close to production as possible per coderule.mdc); raw SQL DDL applied without Alembic (would defeat the version-tracking the runner depends on); a quality_metadata validation at the DB layer (would lock the schema to the JSONB shape — the application-side validation is the single source of truth).

Contract

This task does NOT produce a new contract file — it implements the tile_metadata_store.md contract's persistence surface. The schema-fixture file tests/fixtures/c6_postgres_schema_v1.sql is the diff target referenced in tile_metadata_store.md § Test Cases (schema-shape-fixture-diff) — but the contract document of record stays the Protocol contract.