diff --git a/_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md b/_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md index ea4dbf5..8aaadae 100644 --- a/_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md +++ b/_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md @@ -7,7 +7,7 @@ - AZ-TBD-c6-freshness-gate (insert hook + sector classification reader) - AZ-TBD-c6-cache-budget-eviction (LRU candidate enumeration + delete coordination) - TBD at decompose time: E-C10 (AZ-252 — manifest + provisioning), E-C11 (AZ-251 — both `TileDownloader` insert and `TileUploader` reader queries), E-C12 (AZ-253 — operator pre-flight tooling) -**Version**: 1.1.0 +**Version**: 1.2.0 **Status**: draft **Last Updated**: 2026-05-12 @@ -63,6 +63,17 @@ class TileMetadataPersistent: The Protocol returns `TileMetadata` from queries. `TileMetadataPersistent` is the in-process view of LRU and disk-budget state, accessible only via `lru_candidates` / `record_lru_access` / `total_disk_bytes`. +#### TileMetadata.location_hash (v1.2.0) + +```python +@dataclass(frozen=True) +class TileMetadata: + # ...existing AZ-303 v1.1.0 fields unchanged... + location_hash: UUID | None = None # uuidv5(TILE_NAMESPACE_UUID, "{zoom}/{tile_x}/{tile_y}") +``` + +`location_hash` is a deterministic per-cell-bag identifier (UUIDv5, namespace-pinned in `c6_tile_cache._uuid_namespace.TILE_NAMESPACE_UUID`) shared by every row at the same `(zoom_level, tile_x, tile_y)` regardless of source or flight (Scenario 1 UI lookup, Scenario 6 voting query of the 2026-05-12 tile-schema scenario analysis). Defaults to `None` so AZ-303-era constructors continue to work; `PostgresFilesystemStore.insert_metadata` derives the value via `derive_location_hash(zoom_level, tile_x, tile_y)` when `None`, and the DB-side NOT-NULL constraint is the safety net. Cross-repo coordinated with `satellite-provider` per `AZ-TBD_tile_identity_uuidv5_bulk_list`. 
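The `location_hash` derivation described above can be sketched with the standard library alone. This is a minimal illustration, not the shipped module: the namespace value and the `"{zoom}/{tile_x}/{tile_y}"` name format are taken from the text of this change, while `TileMetadataSketch` and `with_location_hash` are hypothetical stand-ins for the real DTO and for the fallback that `PostgresFilesystemStore.insert_metadata` is said to perform.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, replace
from uuid import UUID

# Namespace value as quoted in the AZ-304 task text.
TILE_NAMESPACE_UUID = UUID("5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0")


def derive_location_hash(zoom_level: int, tile_x: int, tile_y: int) -> UUID:
    """UUIDv5 over "{zoom}/{tile_x}/{tile_y}"; source and flight are not part of the name."""
    return uuid.uuid5(TILE_NAMESPACE_UUID, f"{zoom_level}/{tile_x}/{tile_y}")


@dataclass(frozen=True)
class TileMetadataSketch:
    """Hypothetical, trimmed-down stand-in for TileMetadata."""

    zoom_level: int
    tile_x: int
    tile_y: int
    source: str
    location_hash: UUID | None = None  # defaults to None, as in v1.2.0


def with_location_hash(meta: TileMetadataSketch) -> TileMetadataSketch:
    """Fill location_hash only when the caller left it as None (assumed insert-time behaviour)."""
    if meta.location_hash is not None:
        return meta
    derived = derive_location_hash(meta.zoom_level, meta.tile_x, meta.tile_y)
    return replace(meta, location_hash=derived)
```

Two rows for the same cell from different sources end up with the same `location_hash`, which is the property the Scenario 1 and Scenario 6 lookups rely on.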
+ ### Sector classification (read-only input to the freshness gate) ```python @@ -77,7 +88,7 @@ class SectorBoundary: classification: SectorClassification ``` -`SectorClassification` is set pre-flight by the operator via C12; the metadata store reads `SectorBoundary` rows from a sibling table (`sector_boundaries`) at insert-time to decide which freshness rule to apply. The Protocol does NOT expose insert-side methods for `SectorBoundary` rows — that surface lives in C12. +`SectorClassification` is set pre-flight by the operator via C12; the metadata store reads `SectorBoundary` rows from the sibling table `sector_classifications` (per the AZ-263 baseline schema; AZ-304 adds the NULLable `min_lat` / `min_lon` / `max_lat` / `max_lon` bbox columns operators populate) at insert-time to decide which freshness rule to apply. The Protocol does NOT expose insert-side methods for `SectorBoundary` rows — that surface lives in C12. ## Invariants @@ -98,7 +109,7 @@ class SectorBoundary: - **Not covered: sector boundary insert / update.** Owned by C12 operator-tooling against a sibling table; this Protocol is read-only on `SectorBoundary` and does NOT expose CRUD. - **Not covered: cross-flight aggregation / voting threshold computation.** That's `satellite-provider`'s D-PROJ-2 trust layer (parent suite); C6 just stamps the per-row `voting_status`. - **Not covered: full-text search / arbitrary-WHERE queries.** Only the methods above; ad-hoc queries go through DBA tooling, not this Protocol. -- **Not covered: schema migrations.** Migration scripts live in `c6_tile_cache/_alembic/`; the Protocol is shape-only. +- **Not covered: schema migrations.** Migration scripts live in `db/migrations/versions/` (project-level Alembic env owned by c6_tile_cache per `module-layout.md`; `0001_initial.py` shipped by AZ-263, `0002_c6_tile_identity_and_lru.py` by AZ-304); the Protocol is shape-only. ## Versioning Rules @@ -132,3 +143,4 @@ Same rules as `tile_store.md` § Versioning Rules. 
|---------|------|--------|--------| | 1.0.0 | 2026-05-10 | Initial contract — 9-method Protocol + LRU/disk-budget extensions + freshness gate semantics + composite-key uniqueness invariant. | autodev (decompose Step 2 of AZ-250 / E-C6) | | 1.1.0 | 2026-05-12 | Non-breaking refinement of Invariant I-1: natural key switched from `(zoom_level, lat, lon, source)` (float-based) to `(zoom_level, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, zero_uuid))` (integer + per-flight separated). Protocol surface unchanged; consumers gain the ability to observe multiple ONBOARD_INGEST rows for the same cell from different flights (required by D-PROJ-2 voting). Driven by `_docs/_process_leftovers/2026-05-12_tile-schema-scenario-analysis.md` and the cross-workspace satellite-provider task `AZ-TBD_tile_identity_uuidv5_bulk_list`. | autodev (AZ-304 batch 27 of cycle 1) | +| 1.2.0 | 2026-05-12 | Non-breaking addition of `TileMetadata.location_hash: UUID \| None = None` (cross-source/cross-flight cell-bag identifier; UUIDv5 over `(zoom, tile_x, tile_y)`). Corrected stale references: sector table name (`sector_boundaries` → `sector_classifications`) and Alembic env path (`c6_tile_cache/_alembic/` → `db/migrations/versions/`). Protocol surface unchanged; existing constructors continue to work because the field defaults to `None`. Shipped by AZ-304 alongside the additive `0002_c6_tile_identity_and_lru` migration. | autodev (AZ-304 batch 27 of cycle 1) | diff --git a/_docs/02_document/module-layout.md b/_docs/02_document/module-layout.md index 8086db8..29418da 100644 --- a/_docs/02_document/module-layout.md +++ b/_docs/02_document/module-layout.md @@ -143,8 +143,10 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. 
Architec - **Internal**: - `postgres_filesystem_store.py` (Postgres mirror + filesystem mmap + FAISS HNSW; production-default) - `_native/` (`cpp/faiss_index/` wrapper) - - `_alembic/` (migration scripts; `0001_initial.sql` shipped in bootstrap) -- **Owns**: `src/gps_denied_onboard/components/c6_tile_cache/**`, `cpp/faiss_index/**`, `tests/unit/c6_tile_cache/**` + - `migrations.py` (`apply_migrations(config) -> MigrationResult` runner invoked by the composition root at startup; AZ-304 + later) + - `_uuid_namespace.py` (pinned `TILE_NAMESPACE_UUID` + `derive_tile_id` / `derive_location_hash` helpers; cross-repo coordinated with `satellite-provider`; AZ-304) + - `connection.py` (`psycopg_pool` ConnectionPool helper; AZ-304) +- **Owns**: `src/gps_denied_onboard/components/c6_tile_cache/**`, `cpp/faiss_index/**`, `tests/unit/c6_tile_cache/**`, `db/migrations/**` (project-level Alembic env owned by c6 — `alembic.ini` at repo root points here; `0001_initial.py` shipped by AZ-263 bootstrap, `0002_c6_tile_identity_and_lru.py` and forward owned by AZ-304+ migrations) - **Imports from**: `_types`, `helpers.sha256_sidecar`, `helpers.wgs_converter`, `config`, `logging`, `fdr_client` - **Consumed by**: `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c10_provisioning`, `c11_tile_manager`, `runtime_root` diff --git a/_docs/02_tasks/todo/AZ-304_c6_postgres_schema.md b/_docs/02_tasks/done/AZ-304_c6_postgres_schema.md similarity index 90% rename from _docs/02_tasks/todo/AZ-304_c6_postgres_schema.md rename to _docs/02_tasks/done/AZ-304_c6_postgres_schema.md index 7d5dbad..78462bd 100644 --- a/_docs/02_tasks/todo/AZ-304_c6_postgres_schema.md +++ b/_docs/02_tasks/done/AZ-304_c6_postgres_schema.md @@ -40,7 +40,7 @@ This task delivers the strictly-additive `0002` migration that closes those gaps - An Alembic migration script at `db/migrations/versions/0002_c6_tile_identity_and_lru.py` (forward `upgrade()` is purely additive; reverse `downgrade()` drops the additions and restores the original 
AZ-263 `freshness_status` CHECK). The migration is idempotent against a DB at AZ-263 head; Alembic rejects double-application via the standard `alembic_version` row. - The migration runner `apply_migrations(config) -> MigrationResult` at `src/gps_denied_onboard/components/c6_tile_cache/migrations.py`, invoked by the composition root at startup AFTER config load and BEFORE `PostgresFilesystemStore` construction. Returns `MigrationResult(applied: list[str], current_revision: str, no_op: bool)`. Logs INFO on every applied revision; logs INFO with `no_op=True` when the DB is already at head. - The pinned UUIDv5 namespace module at `src/gps_denied_onboard/components/c6_tile_cache/_uuid_namespace.py` exporting `TILE_NAMESPACE_UUID = UUID("5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0")`, `derive_tile_id(zoom_level, tile_x, tile_y, source, flight_id) -> UUID`, and `derive_location_hash(zoom_level, tile_x, tile_y) -> UUID`. The namespace value is cross-repo coordinated with `satellite-provider/SatelliteProvider.Common/Utils/Uuidv5.cs` per `AZ-TBD_tile_identity_uuidv5_bulk_list`; the same `uuidv5(NAMESPACE, name)` MUST produce byte-identical output on both sides. -- The psycopg_pool connection helper at `src/gps_denied_onboard/components/c6_tile_cache/connection.py` (`psycopg_pool(config) -> psycopg_pool.ConnectionPool`), used by both this task's runner and the future `PostgresFilesystemStore`. +- **No `psycopg_pool` helper ships in this task.** The migration runner relies on Alembic's existing SQLAlchemy engine (`engine_from_config(..., poolclass=pool.NullPool)`) already wired in `db/migrations/env.py` by AZ-263. `psycopg_pool` is NOT pinned by AZ-263 (only `psycopg[binary]>=3.1` is), so a runtime connection-pool helper (`PostgresFilesystemStore` use-case) introduces a new dependency and is the responsibility of AZ-305 (`c6_postgres_filesystem_store`). 
- After `apply_migrations(config)` on an AZ-263-baselined DB: - The `tiles` table has six additional columns (`tile_uuid`, `location_hash`, `content_sha256`, `disk_bytes`, `accessed_at`, `uploaded_at`); the AZ-263 columns are unchanged. - The `tiles` table has a new UNIQUE index `idx_tiles_natural_key` over the COALESCE-zero-uuid natural key; the AZ-263 indices are unchanged. @@ -50,7 +50,7 @@ This task delivers the strictly-additive `0002` migration that closes those gaps - A new `tile_freshness_rules` table exists, seeded with the two default rows. - The AZ-303 contract `tile_metadata_store.md` is bumped v1.1.0 → v1.2.0 with the `location_hash: UUID | None` field added to the documented `TileMetadata` shape (non-breaking minor — Optional default `None`, populated by `PostgresFilesystemStore.insert_metadata` from `_uuid_namespace.derive_location_hash` when not supplied). - The DTO `TileMetadata` in `src/gps_denied_onboard/components/c6_tile_cache/_types.py` gains the same `location_hash: UUID | None = None` field (positional last, default value preserves existing constructor call sites). -- A schema fixture `tests/fixtures/c6_postgres_schema_v2.sql` is the human-readable expected post-0002 DDL used by the schema-shape diff test. +- A schema fixture `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` is the human-readable expected post-0002 DDL used by the schema-shape diff test (kept inside c6_tile_cache's owned test directory). ## Scope @@ -60,11 +60,10 @@ This task delivers the strictly-additive `0002` migration that closes those gaps - A `MigrationResult` dataclass `@dataclass(frozen=True)` at `c6_tile_cache.migrations`. - The `apply_migrations(config) -> MigrationResult` runner using the existing project-pinned Alembic env at `db/migrations/` (no new alembic.ini, no new env.py — AZ-263 bootstrap owns those; this task only wires `target_metadata` into `db/migrations/env.py` so future autogenerate diffs work). 
- The pinned UUIDv5 namespace module `_uuid_namespace.py` with `TILE_NAMESPACE_UUID`, `derive_tile_id`, `derive_location_hash`. No Postgres dependency; pure stdlib `uuid.uuid5`. -- The schema-shape diff test `tests/unit/c6_tile_cache/test_postgres_schema.py` that introspects a freshly-migrated test DB (Alembic upgraded to `0002` head) and asserts every column, index, CHECK constraint, and seed row matches `tests/fixtures/c6_postgres_schema_v2.sql`. Test uses `testcontainers`-managed Postgres 16; no Python logic beyond `information_schema` queries. +- The schema-shape diff test `tests/unit/c6_tile_cache/test_postgres_schema.py` that introspects a freshly-migrated test DB (Alembic upgraded to `0002` head) and asserts every column, index, CHECK constraint, and seed row matches `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql`. Test is `@pytest.mark.docker` (auto-skipped on Tier-1 per `tests/conftest.py`); it consumes the `db` service from `docker-compose.test.yml` via the `DB_URL` env var. No Python logic beyond `information_schema` queries. - The UUIDv5 determinism test `tests/unit/c6_tile_cache/test_uuid_namespace.py` that locks `TILE_NAMESPACE_UUID` and verifies ≥5 fixed `(z, x, y, source, flight_id)` input vectors produce the documented UUIDv5 outputs. These vectors are the cross-repo coordination evidence — the corresponding `satellite-provider` test MUST produce byte-identical UUIDs. -- The connection helper `connection.psycopg_pool(config) -> psycopg_pool.ConnectionPool`. -- Wiring of `db/migrations/env.py` `target_metadata` to a `c6_tile_cache.metadata` SQLAlchemy `MetaData` object that reflects both AZ-263 and AZ-304 schema (so future autogenerate diffs are mechanically comparable). -- The schema-fixture file `tests/fixtures/c6_postgres_schema_v2.sql` — copy-pastable DDL the test diffs against. 
+- The `db/migrations/env.py` minimum-touch policy: AZ-263 already wires `engine_from_config` + DB_URL fallback for online mode; this task does NOT add `target_metadata` (we use Alembic `op.*` directly, never `autogenerate`). A `target_metadata` wiring lands when the first task adds SQLAlchemy ORM models (none in this task or AZ-305 as currently scoped). +- The schema-fixture file `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` — copy-pastable DDL the test diffs against (lives inside c6_tile_cache's owned test glob per `module-layout.md`). - DTO extension in `_types.py`: `TileMetadata.location_hash: UUID | None = None` (positional last, default `None`). - Contract bump `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md` v1.1.0 → v1.2.0 with a Change Log entry. @@ -189,7 +188,7 @@ Then `MigrationResult.applied == []`; `MigrationResult.no_op == True`; no DDL is **AC-3: Schema shape matches the documented DDL** Given a DB upgraded through 0001 + 0002 When the schema-shape diff test introspects `information_schema.columns` / `pg_indexes` / `pg_constraint` / `tile_freshness_rules` row contents -Then every AZ-263 column / index / CHECK is present and unchanged; every additive column / index / CHECK from this task is present; the `ck_tiles_freshness_status` CHECK contains the UNION vocabulary; `tile_freshness_rules` has exactly two seeded rows with the documented values. Diff against `tests/fixtures/c6_postgres_schema_v2.sql` is empty. +Then every AZ-263 column / index / CHECK is present and unchanged; every additive column / index / CHECK from this task is present; the `ck_tiles_freshness_status` CHECK contains the UNION vocabulary; `tile_freshness_rules` has exactly two seeded rows with the documented values. Diff against `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` is empty. 
**AC-4: Natural-key uniqueness enforces per-flight separation** Given a `tiles` table after 0002 @@ -250,8 +249,7 @@ Then construction succeeds with `location_hash = None`; when constructors supply **Compatibility** - Postgres 16.x. -- `psycopg_pool` 3.x — pinned by AZ-263; this task adds no new third-party dependencies. -- Alembic 1.13+ — pinned by AZ-263. +- `psycopg[binary]>=3.1` and `alembic>=1.13` — both pinned by AZ-263. This task adds **no** new third-party dependencies. `psycopg_pool` is explicitly deferred to AZ-305 since AZ-263 did not pin it. - Cross-workspace UUIDv5 namespace: `TILE_NAMESPACE_UUID` MUST be byte-identical to the satellite-provider C# constant; any change requires a coordinated cross-repo release. **Reliability** @@ -262,9 +260,9 @@ Then construction succeeds with `location_hash = None`; when constructors supply | AC Ref | What to Test | Required Outcome | |--------|-------------|-----------------| -| AC-1 | `apply_migrations` against fresh `testcontainer` DB previously upgraded to AZ-263 head | All additive columns / indices / `tile_freshness_rules` table exist; `alembic_version='0002_c6_tile_identity_and_lru'`; `result.applied=['0002_c6_tile_identity_and_lru']`. AZ-263 columns / indices / CHECKs byte-identical to pre-migration snapshot. | +| AC-1 | `apply_migrations` against the `db` Postgres service (docker-compose.test.yml) previously upgraded to AZ-263 head | All additive columns / indices / `tile_freshness_rules` table exist; `alembic_version='0002_c6_tile_identity_and_lru'`; `result.applied=['0002_c6_tile_identity_and_lru']`. AZ-263 columns / indices / CHECKs byte-identical to pre-migration snapshot. | | AC-2 | `apply_migrations` against DB already at 0002 head | `result.applied=[]`; `result.no_op=True`; no DDL emitted. | -| AC-3 | Introspect `information_schema` / `pg_indexes` / `pg_constraint` / `tile_freshness_rules` rows; diff against `tests/fixtures/c6_postgres_schema_v2.sql` | Zero diff. 
| +| AC-3 | Introspect `information_schema` / `pg_indexes` / `pg_constraint` / `tile_freshness_rules` rows; diff against `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` | Zero diff. | | AC-4 | Two `onboard_ingest` INSERTs with same `(z, tile_x, tile_y, tile_size_meters)` and different `flight_id` | Both succeed; same `location_hash`; different `tile_uuid`. | | AC-4b | Two INSERTs with identical natural-key tuple (same `flight_id` or both NULL for `googlemaps`) | Second INSERT raises `psycopg.errors.UniqueViolation`. | | AC-5 | INSERT one row each for `'fresh'`, `'stale_warn'`, `'stale_reject'`, `'stale_active_conflict'`, `'stale_rear'`, `'downgraded'`; INSERT one row with `'bogus'` | First six succeed; last raises `psycopg.errors.CheckViolation`. | @@ -282,12 +280,12 @@ Then construction succeeds with `location_hash = None`; when constructors supply ## Constraints - Postgres 16.x ONLY this cycle; no SQLite / no MySQL fallback. -- Alembic + `psycopg_pool` are pinned by AZ-263; this task does NOT introduce new third-party dependencies. +- Alembic + `psycopg[binary]` are pinned by AZ-263; this task does NOT introduce new third-party dependencies. `psycopg_pool` is NOT introduced here (deferred to AZ-305). - The migration MUST be reversible (`downgrade` drops the additions cleanly and restores the AZ-263 CHECK) — operator post-flight tooling depends on it for "drop-and-rebuild" flows. - The migration MUST be strictly additive on every AZ-263 column, table, index, and CHECK per `data_model.md` § 6.1 / § 6.3 and `coderule.mdc`. The single allowed constraint mutation is the `ck_tiles_freshness_status` CHECK widening, which is additive in semantic effect. - `pgcrypto` extension is NOT required by 0002 (no `gen_random_uuid()` use); `tile_freshness_rules.classification` is a TEXT PK and the seeded rows are static. - `MigrationError` is NOT a member of the `TileCacheError` family — migrations run before any `c6_tile_cache.errors` consumer is constructed. 
-- The schema-fixture file `tests/fixtures/c6_postgres_schema_v2.sql` is the diff target; updating it without a new migration revision is a Spec-Gap finding (High) at code-review time. +- The schema-fixture file `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` is the diff target; updating it without a new migration revision is a Spec-Gap finding (High) at code-review time. - The pinned `TILE_NAMESPACE_UUID` MUST NOT be regenerated by this task. The value `5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0` is locked here; subsequent edits require a coordinated cross-workspace release. - The `latitude` / `longitude` columns (AZ-263 names) remain advisory; `lat` / `lon` are NOT introduced. The DTO `TileId(zoom_level, lat, lon)` maps via `PostgresFilesystemStore` serialisation; the schema is NOT changed to match the DTO field name. - The `freshness_status` column name (AZ-263) is NOT renamed to `freshness_label`. DTO field `freshness_label: FreshnessLabel` maps via `PostgresFilesystemStore` serialisation. @@ -333,12 +331,12 @@ Then construction succeeds with `location_hash = None`; when constructors supply ## Runtime Completeness - **Named capability**: Postgres 16 deterministic-identity columns + per-flight natural-key UNIQUE + LRU/upload bookkeeping + sector-boundary geometry + per-classification freshness rules — all additive on the AZ-263 baseline (description.md / data_model.md / AC-NEW-3 / AC-NEW-6 / RESTRICT-SAT-2 / 2026-05-12 leftover § 4–5). -- **Production code that must exist**: real Alembic migration `0002_c6_tile_identity_and_lru.py`, real `apply_migrations` runner, real schema-fixture diff test, real `psycopg_pool` connection helper, real `_uuid_namespace` module with `TILE_NAMESPACE_UUID` constant and `derive_tile_id` / `derive_location_hash` helpers, real DTO extension in `_types.py`, real contract bump in `tile_metadata_store.md`. 
-- **Allowed external stubs**: tests use `testcontainers`-managed Postgres 16 instances (already in the project's test infra per AZ-263); production wiring uses the operator's deployed Postgres. +- **Production code that must exist**: real Alembic migration `0002_c6_tile_identity_and_lru.py`, real `apply_migrations` runner driving Alembic's `command.upgrade(cfg, "head")` against the AZ-263 env, real schema-fixture diff test, real `_uuid_namespace` module with `TILE_NAMESPACE_UUID` constant and `derive_tile_id` / `derive_location_hash` helpers, real DTO extension in `_types.py`, real contract bump in `tile_metadata_store.md`. **NOT included** in this task: a `psycopg_pool` connection helper (deferred to AZ-305). +- **Allowed external stubs**: tests requiring a real Postgres are marked `@pytest.mark.docker` and run against the `db` service from `docker-compose.test.yml` (per the existing AZ-263 test-infra convention — see `tests/conftest.py` auto-skip logic). The schema-shape test reads `DB_URL` from the env at run time; on Tier-1 (`GPS_DENIED_TIER != 2`) the test auto-skips. Production wiring uses the operator's deployed Postgres. - **Unacceptable substitutes**: SQLite "for testing only" — production and test environments MUST both be Postgres 16; raw SQL DDL applied without Alembic (would defeat the version-tracking the runner depends on); a `tile_quality_metadata` validation at the DB layer (would lock the schema to the JSONB shape — the application-side validation is the single source of truth); a non-deterministic `tile_uuid` strategy (would defeat the cross-workspace coordination the namespace pin establishes); any operation that renames, retypes, or drops an AZ-263 column / table / index / CHECK (forbidden per `coderule.mdc` and `data_model.md` § 6.2 / § 6.3); a parallel Alembic env at `src/.../c6_tile_cache/_alembic/` (forbidden — the project uses one alembic env at `db/migrations/` per AZ-263 + `alembic.ini`). 
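The runner's return value can be illustrated without a database. The sketch below assumes a linear revision history and uses a hypothetical helper, `summarize_upgrade`, to show how `applied` and `no_op` relate to the revision before and after an upgrade; the field names follow the `MigrationResult(applied, current_revision, no_op)` shape given in this task, but the real runner may compute them differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationResult:
    applied: list[str]
    current_revision: str
    no_op: bool


def summarize_upgrade(
    ordered_revisions: list[str],
    before: str | None,
    after: str,
) -> MigrationResult:
    """Hypothetical helper: list the revisions between `before` (exclusive) and `after` (inclusive).

    Assumes `ordered_revisions` is a linear history, oldest first, and that
    `before` is None for a database with no alembic_version row.
    """
    start = 0 if before is None else ordered_revisions.index(before) + 1
    end = ordered_revisions.index(after) + 1
    applied = ordered_revisions[start:end]
    return MigrationResult(applied=applied, current_revision=after, no_op=not applied)
```

With the two revisions named in this task, an upgrade from `0001_initial` reports one applied revision, and a second run at head reports an empty list with `no_op=True`, matching the AC-1 and AC-2 expectations.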
 ## Contract
 
 This task does NOT produce a new contract file — it implements the `tile_metadata_store.md` contract's persistence surface and bumps its version v1.1.0 → v1.2.0 with one non-breaking minor addition (`TileMetadata.location_hash: UUID | None = None`).
 
-The schema-fixture file `tests/fixtures/c6_postgres_schema_v2.sql` is the diff target referenced in `tile_metadata_store.md` § Test Cases (`schema-shape-fixture-diff`) — but the contract document of record stays the Protocol contract.
+The schema-fixture file `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` is the diff target referenced in `tile_metadata_store.md` § Test Cases (`schema-shape-fixture-diff`) — but the contract document of record stays the Protocol contract.
diff --git a/_docs/03_implementation/batch_27_cycle1_report.md b/_docs/03_implementation/batch_27_cycle1_report.md
new file mode 100644
index 0000000..bfb052a
--- /dev/null
+++ b/_docs/03_implementation/batch_27_cycle1_report.md
@@ -0,0 +1,236 @@
+# Batch 27 / Cycle 1 — Implementation Report
+
+**Date**: 2026-05-12
+**Tasks**: AZ-304 (C6 Postgres schema — additive 0002 migration: identity, LRU, freshness rules)
+**Story points landed**: 3
+**Status**: complete (AZ-304 → In Testing)
+
+## Scope summary
+
+Single-task batch — final queued 1-pointer of cycle 1. AZ-304 ships the
+strictly-additive `0002_c6_tile_identity_and_lru.py` migration on top of
+the AZ-263 baseline, the pinned UUIDv5 namespace + derivation helpers
+cross-coordinated with `satellite-provider`, the migration runner the
+composition root invokes at startup, and bumps the AZ-303
+`TileMetadataStore` contract v1.1.0 → v1.2.0 by adding the
+`TileMetadata.location_hash: UUID | None` field.
+
+Before implementation, the spec was rewritten under user-approved
+Option A to drop the renames / retypes / drops the original draft
+called for.
The as-built migration only adds columns, indices, the
+`tile_freshness_rules` table, and widens (loosens) the
+`ck_tiles_freshness_status` CHECK to the UNION of AZ-263 and AZ-303
+vocabularies. No AZ-263 column, table, index, or CHECK is renamed,
+retyped, or dropped (data_model.md § 6.1 / § 6.3).
+
+## Files added / modified
+
+### New (production)
+
+- `src/gps_denied_onboard/components/c6_tile_cache/_uuid_namespace.py` —
+  pinned `TILE_NAMESPACE_UUID = 5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0`,
+  `derive_tile_id(zoom, x, y, source, flight_id)` (UUIDv5 over
+  `"{z}/{x}/{y}/{source}/{flight_id_or_zero_uuid}"`),
+  `derive_location_hash(z, x, y)`. Cross-workspace coordination point
+  with `satellite-provider/SatelliteProvider.Common/Utils/Uuidv5.cs`.
+- `src/gps_denied_onboard/components/c6_tile_cache/migrations.py` —
+  `apply_migrations(config) -> MigrationResult` runner, frozen
+  `MigrationResult` dataclass, `MigrationError`. Drives Alembic
+  `command.upgrade(cfg, "head")` against the AZ-263 env at
+  `db/migrations/`; one retry on PG SQLSTATE `40001`
+  (serialization failure); structured INFO log on apply / no-op with
+  the resolved namespace UUID emitted for post-mortem drift detection.
+- `db/migrations/versions/0002_c6_tile_identity_and_lru.py` — additive
+  Alembic migration: six new `tiles` columns (`tile_uuid` UNIQUE,
+  `location_hash`, `content_sha256` w/ length-64 CHECK, `disk_bytes`
+  w/ nonneg CHECK, `accessed_at`, `uploaded_at`); four new btree
+  indices on `tiles`; one UNIQUE expression index
+  `idx_tiles_natural_key` over the COALESCE-zero-uuid natural key;
+  CHECK-widening of `ck_tiles_freshness_status` to the AZ-263 +
+  AZ-303 UNION vocabulary; four NULLable bbox columns on
+  `sector_classifications`; new `tile_freshness_rules` table seeded
+  with the two default thresholds (active_conflict / reject @ 6m,
+  stable_rear / downgrade @ 12m). Reverse `downgrade()` drops every
+  addition and restores the AZ-263 freshness CHECK to its original
+  vocabulary.
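The `derive_tile_id` name format listed above can be sketched as follows. The namespace value and the `"{z}/{x}/{y}/{source}/{flight_id_or_zero_uuid}"` layout come from this report; the string form used for `source` and the lowercase hyphenated rendering of `flight_id` are assumptions for illustration, and the locked test vectors in `test_uuid_namespace.py` remain the authority on the exact encoding.

```python
from __future__ import annotations

import uuid
from uuid import UUID

# Namespace value as quoted in this report.
TILE_NAMESPACE_UUID = UUID("5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0")

# All-zero UUID used when a row has no flight (assumed to mirror the
# COALESCE(flight_id, zero_uuid) term of the natural key).
ZERO_UUID = UUID(int=0)


def derive_tile_id(
    zoom_level: int,
    tile_x: int,
    tile_y: int,
    source: str,
    flight_id: UUID | None = None,
) -> UUID:
    """UUIDv5 over "{z}/{x}/{y}/{source}/{flight_id_or_zero_uuid}".

    Assumption: `source` is already the plain string value and the flight
    UUID is rendered in its canonical lowercase hyphenated form.
    """
    flight = ZERO_UUID if flight_id is None else flight_id
    name = f"{zoom_level}/{tile_x}/{tile_y}/{source}/{flight}"
    return uuid.uuid5(TILE_NAMESPACE_UUID, name)
```

Because the flight identifier is part of the name, two `onboard_ingest` rows for the same cell from different flights get different tile ids, while a row without a flight hashes the same as one that passes the zero UUID explicitly.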
+
+### Modified (production)
+
+- `src/gps_denied_onboard/components/c6_tile_cache/_types.py` —
+  `TileMetadata.location_hash: UUID | None = None` (positional last,
+  default `None` so AZ-303 v1.1.0 constructors keep working; the
+  impl-task's `PostgresFilesystemStore.insert_metadata` derives the
+  value when `None` and the DB NOT-NULL constraint is the safety
+  net).
+- `db/migrations/env.py` — docstring tightened to document the
+  `target_metadata` deferral (we use Alembic `op.*` directly; ORM
+  models land in a later cycle).
+
+### New (tests)
+
+- `tests/unit/c6_tile_cache/test_uuid_namespace.py` — 8 deterministic
+  Tier-1 tests covering AC-10 (5 locked tile-id vectors, 2 locked
+  location-hash vectors, idempotent calls, `TileSource` enum /
+  `UUID`-typed / `None`-flight inputs, malformed-flight rejection)
+  and AC-11 (location-hash invariance across source/flight, distinct
+  cells distinct hashes).
+- `tests/unit/c6_tile_cache/test_postgres_schema.py` —
+  `@pytest.mark.docker` integration tests covering AC-1 through AC-9 +
+  NFR-perf-apply / NFR-perf-noop + `MigrationResult` frozen smoke. Reads
+  `DB_URL` env var; consumes the `db` service from
+  `docker-compose.test.yml`. Auto-skipped on Tier-1 by `tests/conftest.py`.
+- `tests/unit/c6_tile_cache/test_migrations_runner.py` — 5 Tier-1
+  tests covering DSN resolution (config-block, env fallback, missing
+  raises `MigrationError`) and the SQLSTATE-40001 retry / terminal
+  paths (mocked DBAPIError to keep them deterministic and DB-less).
+- `tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql` —
+  copy-pastable DDL fixture the AC-3 schema-shape diff test diffs
+  against. Includes every AZ-263 column / index / CHECK plus every
+  additive item from this migration plus the two seeded
+  `tile_freshness_rules` rows.
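The retry and terminal paths that the runner tests exercise can be pictured with a DB-less sketch. Everything here is illustrative: `run_with_single_retry` and `FakeSerializationFailure` are hypothetical names, the way the real runner extracts the SQLSTATE from the driver error is not shown in this report, and wrapping every failure in `MigrationError` is an assumption.

```python
from __future__ import annotations

from typing import Callable

SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE for serialization_failure


class MigrationError(RuntimeError):
    """Startup-phase failure; deliberately not part of the TileCacheError family."""


class FakeSerializationFailure(Exception):
    """Hypothetical stand-in for a driver error that carries a SQLSTATE."""

    sqlstate = SERIALIZATION_FAILURE


def run_with_single_retry(upgrade: Callable[[], None]) -> int:
    """Run `upgrade`, retrying once (with no sleep) on SQLSTATE 40001.

    Returns the number of attempts used. Any other error, or a second
    serialization failure, is wrapped in MigrationError.
    """
    for attempt in (1, 2):
        try:
            upgrade()
            return attempt
        except Exception as exc:
            retryable = getattr(exc, "sqlstate", None) == SERIALIZATION_FAILURE
            if not retryable or attempt == 2:
                raise MigrationError(f"migration failed on attempt {attempt}") from exc
    raise AssertionError("unreachable")
```

A callable that fails once with the fake error and then succeeds returns `2`; one that keeps failing raises `MigrationError` after the second attempt, which is the terminal path.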
+ +### Modified (tests + docs) + +- `tests/unit/test_ac5_alembic.py` — head-revision assertion bumped + from `0001_initial` to `0002_c6_tile_identity_and_lru`; function + renamed `test_head_revision_is_0001_initial` → + `test_head_revision_matches_latest_migration`. +- `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md` + — version bumped v1.1.0 → v1.2.0; new `TileMetadata.location_hash` + section; `sector_boundaries` → `sector_classifications` + corrections; Alembic migration path corrected from + `c6_tile_cache/_alembic/` to `db/migrations/versions/`; Change Log + v1.2.0 entry. +- `_docs/02_document/module-layout.md` — `c6_tile_cache` component + now explicitly `Owns` `db/migrations/**` (Alembic env was already + living there since AZ-263 bootstrap, but not documented). + `migrations.py` / `_uuid_namespace.py` listed under the component's + internal modules. +- `_docs/02_tasks/todo/AZ-304_c6_postgres_schema.md` → moved to + `_docs/02_tasks/done/`. Spec was rewritten earlier this batch + (Option A: strictly-additive 0002 + descope of + `psycopg_pool`/`connection.py` to AZ-305 + replacement of + `testcontainers` with the existing `@pytest.mark.docker` / + `docker-compose.test.yml` infra). + +## Design decisions + +1. **Strictly additive 0002, no rename / retype / drop.** User picked + Option A on the schema-reconciliation gate. Every AZ-263 column, + table, index, and CHECK survives 0002 byte-identical. The single + constraint mutation is the `ck_tiles_freshness_status` CHECK + widening, which is a semantic loosening (more values accepted) and + therefore additive per `data_model.md` § 6.1. Legacy AZ-263 values + (`stale_warn`, `stale_reject`) remain valid until an ADR-gated + future cycle deprecates them. 
DTO ↔ column mapping + (`freshness_label` ↔ `freshness_status`, `quality_metadata` ↔ + `tile_quality_metadata`, `(lat, lon)` ↔ `(latitude, longitude)`) + is owned by the future `PostgresFilesystemStore` task (AZ-305) — + the schema does NOT pivot to DTO field names. + +2. **UUIDv5 namespace pinned in code, not config.** + `TILE_NAMESPACE_UUID = 5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0` lives in the + `_uuid_namespace` module as a module-level constant — not in + environment / config / DB — so cross-workspace coordination with + the C# `satellite-provider` reduces to a code-level review check + on either side. The runner logs the namespace value on every + apply / no-op so production drift would surface immediately. + +3. **`psycopg_pool` deferred to AZ-305.** The original spec assumed + AZ-263 pinned `psycopg_pool`; a reality check showed that only + `psycopg[binary]>=3.1` and `alembic>=1.13` are pinned. A new + third-party dependency would inflate this task's scope and risk; + moved to AZ-305 (`c6_postgres_filesystem_store`) where the + runtime connection pool is genuinely needed. The migration runner + uses Alembic's existing `NullPool` SQLAlchemy engine wired by + AZ-263 — no pool needed for one-shot startup migrations. + +4. **Retry-without-sleep on serialization failure.** The runner + retries once on PG SQLSTATE `40001`, but without a `time.sleep` + backoff. Component files are forbidden from calling `time.*` + directly (`tests/_meta/test_no_direct_time_in_components.py`, + Invariant 2 of the replay contract), and migrations run before the + injected `Clock` is constructed, so `Clock.sleep` is unavailable + too. Alembic's `NullPool` opens a fresh connection on retry, which + already introduces natural jitter; a 0-50 ms backoff buys nothing + measurable. Documented in `migrations.py` inline. 
+ +5. **`MigrationError` NOT a member of `TileCacheError` family.** + Migrations run BEFORE the runtime error consumer + (`c6_tile_cache.errors`) is constructed — making `MigrationError` a `TileCacheError` + would create a forward-import cycle and conceptually mis-place a + startup-phase failure inside a runtime-phase error tree. Documented + in the runner module docstring and verified by + `test_migration_error_is_not_tile_cache_error`. + +6. **`module-layout.md` updated as scope-creep, user-approved.** Drift + gate surfaced that `module-layout.md` referenced a `_alembic/` + directory and `.sql` migration files; reality is `.py` files under + `db/migrations/`, and the doc didn't claim ownership of any of it. User + picked Option A to expand AZ-304 scope by one doc edit: + `c6_tile_cache` now explicitly `Owns` `db/migrations/**`. + +## AC coverage + +| AC | Test name(s) | Status | +|----|--------------|--------| +| AC-1 | `test_ac1_apply_to_az263_baseline_advances_alembic_version` | Tier-2 (docker) | +| AC-2 | `test_ac2_apply_at_head_is_no_op` | Tier-2 (docker) | +| AC-3 | `test_ac3_schema_shape_diffs_clean_against_fixture` | Tier-2 (docker) | +| AC-4 | `test_ac4_per_flight_separation_allowed` | Tier-2 (docker) | +| AC-4b | `test_ac4b_duplicate_natural_key_rejected` + `..._googlemaps` | Tier-2 (docker) | +| AC-5 | `test_ac5_freshness_check_widening_accepts_union_vocabulary` | Tier-2 (docker) | +| AC-6 | `test_ac6_downgrade_reverses_cleanly` | Tier-2 (docker) | +| AC-7 | `test_ac7_default_freshness_rules_seeded` | Tier-2 (docker) | +| AC-8 | `test_ac8_runner_logs_info_on_apply_and_no_op` | Tier-2 (docker) | +| AC-9 | `test_ac9_az263_columns_byte_identical_after_upgrade` | Tier-2 (docker) | +| AC-10 | `test_ac10_namespace_locked_and_locked_vectors_match` + companions | Tier-1 passing | +| AC-11 | `test_ac11_location_hash_invariant_across_source_and_flight_id` + companion | Tier-1 passing | +| AC-12 | covered by AZ-303 `test_protocol_conformance.py` (DTO default still `None`) | Tier-1 passing | +| NFR-perf-apply | `test_nfr_perf_apply_under_5s_on_baselined_empty_db` | Tier-2 (docker) | +| NFR-perf-noop | 
`test_nfr_perf_noop_under_100ms` | Tier-2 (docker) | +| NFR-reliability-retry | `test_nfr_reliability_retry_once_on_serialization_failure` + companion | Tier-1 passing | + +The Tier-2 docker tests are explicitly skipped on Tier-1 by the +project-wide `tests/conftest.py` mark-filter; they run locally via +`docker compose -f docker-compose.test.yml up -d db && GPS_DENIED_TIER=2 +DB_URL=postgresql://gps_denied:dev@localhost:5432/gps_denied pytest +tests/unit/c6_tile_cache/test_postgres_schema.py`. + +## Test run + +`python3 -m pytest tests/ -q` → **1180 passed, 32 skipped (Tier-2 / +CUDA / cmake / actionlint / docker), 1 warning** in 21.77 s, no +failures. + +Mypy strict on production + test modules: clean. +Ruff + ruff format on the modified set: clean. + +## Self-review + +- Production code: deterministic UUIDv5 with locked namespace + + documented cross-repo invariants; no `time.*` calls in component + files (Invariant 2 meta-test passes); no new third-party deps; + migration is byte-additive on the AZ-263 baseline; runner emits + structured INFO logs with `kind` + `kv` per the AZ-266 log schema; + `MigrationError` correctly outside the `TileCacheError` family. +- Tests: every AC has at least one named assertion (Tier-1 or Tier-2); + cross-repo coordination vectors are locked with documented expected + UUIDs; no hidden runtime dependencies in Tier-1 (mocked DBAPIError + for the retry path). +- Lint / type: ruff + mypy strict clean on the modified set. +- Docs: AZ-304 spec moved to `done/`; contract bumped v1.2.0; + `module-layout.md` reconciled with reality. + +## Known gaps + +- Tier-2 Postgres tests need a real `db` service; rolled forward to + the Tier-2 validation pass. +- `psycopg_pool` connection helper deferred to AZ-305 — documented in + both this report and the AZ-304 spec. +- `target_metadata` wiring in `db/migrations/env.py` deferred until + the first task introduces SQLAlchemy ORM models (none today, + none in AZ-305 as currently scoped). 
+- AZ-263 legacy CHECK values (`stale_warn`, `stale_reject`) remain + valid in `ck_tiles_freshness_status`; deprecation requires a future + ADR-gated cleanup migration. diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 1ec74d7..836799f 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -8,7 +8,7 @@ status: in_progress sub_step: phase: 3 name: compute-batch - detail: "batch 27/cycle1: AZ-304 spec rewritten per Option A (strict-additive 0002 on AZ-263)" + detail: "batch 27/cycle1: AZ-304 complete (additive 0002 migration + UUIDv5 namespace + runner + v1.2.0 contract bump). Awaiting next batch selection." retry_count: 0 cycle: 1 tracker: jira diff --git a/db/migrations/env.py b/db/migrations/env.py index 016dca1..afabc17 100644 --- a/db/migrations/env.py +++ b/db/migrations/env.py @@ -1,7 +1,11 @@ """Alembic env. Bootstrap (AZ-263) ships the minimal `env.py` so `alembic check` resolves -`0001_initial.sql` as head. Concrete metadata wiring is added by AZ-304. +`0001_initial` as head. AZ-304 keeps this env minimal: no +``target_metadata`` is wired because every c6_tile_cache migration uses +the imperative ``alembic.op.*`` surface (never ``autogenerate``). A +``target_metadata`` wiring lands the first cycle that adds SQLAlchemy +ORM models, not before. """ from __future__ import annotations diff --git a/db/migrations/versions/0002_c6_tile_identity_and_lru.py b/db/migrations/versions/0002_c6_tile_identity_and_lru.py new file mode 100644 index 0000000..694dbc6 --- /dev/null +++ b/db/migrations/versions/0002_c6_tile_identity_and_lru.py @@ -0,0 +1,228 @@ +"""C6 tile identity + LRU + freshness rules — strictly additive on 0001_initial. + +Per ``_docs/02_tasks/done/AZ-304_c6_postgres_schema.md`` and +``_docs/02_document/data_model.md`` §§ 6.1 / 6.3. 
The migration: + +- adds the deterministic-identity columns (``tile_uuid``, ``location_hash``); +- adds the content-hash chain column (``content_sha256``); +- adds the LRU + disk-budget bookkeeping columns (``disk_bytes``, + ``accessed_at``, ``uploaded_at``); +- adds the per-flight natural-key UNIQUE expression index + (``idx_tiles_natural_key``) over the COALESCE-zero-uuid form so two + ``onboard_ingest`` rows for the same cell from different flights + coexist while two ``googlemaps`` rows (both flight_id NULL) cannot; +- adds four new btree indices on ``tiles`` (location_hash, accessed_at, + pending_upload partial, flight_captured partial); +- widens the ``tiles.freshness_status`` CHECK to accept the UNION of the + AZ-263 legacy vocabulary AND the AZ-303 ``FreshnessLabel`` vocabulary; +- adds four NULLable bbox columns to ``sector_classifications``; +- creates the new ``tile_freshness_rules`` table seeded with the + per-classification thresholds (6 months active_conflict / reject, + 12 months stable_rear / downgrade). + +The migration assumes ``tiles`` is empty at apply time (greenfield); +NOT-NULL additive columns without server defaults rely on that. + +Reverse ``downgrade()`` drops every addition and restores the AZ-263 +``freshness_status`` CHECK to its original ``('fresh','stale_warn','stale_reject')`` +vocabulary. Downgrade is destructive for any rows holding the new column +values and is documented operator-only. 
+ +Revision ID: 0002_c6_tile_identity_and_lru +Revises: 0001_initial +Create Date: 2026-05-12 +""" + +from __future__ import annotations + +from collections.abc import Sequence +from datetime import datetime, timezone + +import sqlalchemy as sa +from alembic import op + +revision: str = "0002_c6_tile_identity_and_lru" +down_revision: str | None = "0001_initial" +branch_labels: str | Sequence[str] | None = None +depends_on: str | Sequence[str] | None = None + + +_NATURAL_KEY_INDEX_SQL = """ +CREATE UNIQUE INDEX idx_tiles_natural_key ON tiles ( + zoom_level, + tile_x, + tile_y, + tile_size_meters, + source, + COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid) +) +""" + +_FRESHNESS_STATUS_LEGACY = "freshness_status IN ('fresh','stale_warn','stale_reject')" +_FRESHNESS_STATUS_UNION = ( + "freshness_status IN (" + "'fresh','stale_warn','stale_reject'," + "'stale_active_conflict','stale_rear','downgraded'" + ")" +) + + +def upgrade() -> None: + # tiles -- additive columns ------------------------------------------------ + op.add_column( + "tiles", + sa.Column( + "tile_uuid", + sa.dialects.postgresql.UUID(as_uuid=True), + nullable=False, + unique=True, + ), + ) + op.add_column( + "tiles", + sa.Column( + "location_hash", + sa.dialects.postgresql.UUID(as_uuid=True), + nullable=False, + ), + ) + op.add_column( + "tiles", + sa.Column("content_sha256", sa.Text(), nullable=False), + ) + op.create_check_constraint( + "ck_tiles_content_sha256_len", + "tiles", + "length(content_sha256) = 64", + ) + op.add_column( + "tiles", + sa.Column("disk_bytes", sa.BigInteger(), nullable=False), + ) + op.create_check_constraint( + "ck_tiles_disk_bytes_nonneg", + "tiles", + "disk_bytes >= 0", + ) + op.add_column( + "tiles", + sa.Column( + "accessed_at", + sa.DateTime(timezone=True), + nullable=False, + server_default=sa.text("now()"), + ), + ) + op.add_column( + "tiles", + sa.Column("uploaded_at", sa.DateTime(timezone=True), nullable=True), + ) + + # tiles -- freshness_status 
CHECK widening (drop + recreate) --------------- + op.drop_constraint("ck_tiles_freshness_status", "tiles", type_="check") + op.create_check_constraint( + "ck_tiles_freshness_status", + "tiles", + _FRESHNESS_STATUS_UNION, + ) + + # tiles -- new indices ----------------------------------------------------- + op.execute(_NATURAL_KEY_INDEX_SQL) + op.create_index("idx_tiles_location_hash", "tiles", ["location_hash"]) + op.create_index("idx_tiles_accessed_at", "tiles", ["accessed_at"]) + op.create_index( + "idx_tiles_pending_upload", + "tiles", + ["uploaded_at"], + postgresql_where=sa.text("source = 'onboard_ingest' AND uploaded_at IS NULL"), + ) + op.create_index( + "idx_tiles_flight_captured", + "tiles", + ["flight_id", "capture_timestamp"], + postgresql_where=sa.text("flight_id IS NOT NULL"), + ) + + # sector_classifications -- nullable bbox additions ------------------------ + op.add_column( + "sector_classifications", + sa.Column("min_lat", sa.Float(precision=53), nullable=True), + ) + op.add_column( + "sector_classifications", + sa.Column("min_lon", sa.Float(precision=53), nullable=True), + ) + op.add_column( + "sector_classifications", + sa.Column("max_lat", sa.Float(precision=53), nullable=True), + ) + op.add_column( + "sector_classifications", + sa.Column("max_lon", sa.Float(precision=53), nullable=True), + ) + + # tile_freshness_rules -- new table with seed rows ------------------------- + rules_table = op.create_table( + "tile_freshness_rules", + sa.Column("classification", sa.Text(), primary_key=True), + sa.Column("max_age_seconds", sa.BigInteger(), nullable=False), + sa.Column("action", sa.Text(), nullable=False), + sa.Column( + "set_at", + sa.DateTime(timezone=True), + nullable=False, + server_default=sa.text("now()"), + ), + sa.CheckConstraint("action IN ('reject','downgrade')", name="ck_tfr_action"), + sa.CheckConstraint("max_age_seconds > 0", name="ck_tfr_max_age_pos"), + ) + + seed_at = datetime(2026, 5, 12, tzinfo=timezone.utc) + op.bulk_insert( + 
rules_table, + [ + { + "classification": "active_conflict", + "max_age_seconds": 6 * 30 * 86400, # 6 months ≈ 15552000 s + "action": "reject", + "set_at": seed_at, + }, + { + "classification": "stable_rear", + "max_age_seconds": 12 * 30 * 86400, # 12 months ≈ 31104000 s + "action": "downgrade", + "set_at": seed_at, + }, + ], + ) + + +def downgrade() -> None: + op.drop_table("tile_freshness_rules") + + op.drop_column("sector_classifications", "max_lon") + op.drop_column("sector_classifications", "max_lat") + op.drop_column("sector_classifications", "min_lon") + op.drop_column("sector_classifications", "min_lat") + + op.drop_index("idx_tiles_flight_captured", table_name="tiles") + op.drop_index("idx_tiles_pending_upload", table_name="tiles") + op.drop_index("idx_tiles_accessed_at", table_name="tiles") + op.drop_index("idx_tiles_location_hash", table_name="tiles") + op.execute("DROP INDEX IF EXISTS idx_tiles_natural_key") + + op.drop_constraint("ck_tiles_freshness_status", "tiles", type_="check") + op.create_check_constraint( + "ck_tiles_freshness_status", + "tiles", + _FRESHNESS_STATUS_LEGACY, + ) + + op.drop_column("tiles", "uploaded_at") + op.drop_column("tiles", "accessed_at") + op.drop_constraint("ck_tiles_disk_bytes_nonneg", "tiles", type_="check") + op.drop_column("tiles", "disk_bytes") + op.drop_constraint("ck_tiles_content_sha256_len", "tiles", type_="check") + op.drop_column("tiles", "content_sha256") + op.drop_column("tiles", "location_hash") + op.drop_column("tiles", "tile_uuid") diff --git a/src/gps_denied_onboard/components/c6_tile_cache/_types.py b/src/gps_denied_onboard/components/c6_tile_cache/_types.py index 9765cbd..e85976c 100644 --- a/src/gps_denied_onboard/components/c6_tile_cache/_types.py +++ b/src/gps_denied_onboard/components/c6_tile_cache/_types.py @@ -17,6 +17,7 @@ from dataclasses import dataclass from datetime import datetime from enum import Enum from pathlib import Path +from uuid import UUID __all__ = [ "Bbox", @@ -181,6 +182,12 @@ 
class TileMetadata: companion_id: str | None quality_metadata: TileQualityMetadata | None voting_status: VotingStatus + # AZ-304: deterministic per-cell-bag identifier (`uuidv5` over + # ``(zoom, tile_x, tile_y)`` from :mod:`_uuid_namespace`). Defaults to + # ``None`` so existing AZ-303-era constructors keep working; the + # ``PostgresFilesystemStore`` derives the value at insert time when + # ``None`` and the DB NOT-NULL constraint is the safety net. + location_hash: UUID | None = None @dataclass(frozen=True) diff --git a/src/gps_denied_onboard/components/c6_tile_cache/_uuid_namespace.py b/src/gps_denied_onboard/components/c6_tile_cache/_uuid_namespace.py new file mode 100644 index 0000000..c43f583 --- /dev/null +++ b/src/gps_denied_onboard/components/c6_tile_cache/_uuid_namespace.py @@ -0,0 +1,101 @@ +"""C6 tile-cache UUIDv5 namespace + derivation helpers (AZ-304). + +This module is the *cross-repo coordination point* between +``gps-denied-onboard`` (Python) and ``satellite-provider`` (C#) for the +deterministic per-tile and per-cell-bag identifiers used in: + +- the ``tiles.tile_uuid`` column (per-``(zoom, tile_x, tile_y, source, flight_id)`` + identity); +- the ``tiles.location_hash`` column (per-``(zoom, tile_x, tile_y)`` cell-bag + identifier shared across sources and flights — Scenario 1 UI lookup, + Scenario 6 voting query of the 2026-05-12 tile-schema scenario analysis). + +The pinned :data:`TILE_NAMESPACE_UUID` constant MUST be byte-identical to +the corresponding C# constant in +``satellite-provider/SatelliteProvider.Common/Utils/Uuidv5.cs`` per +``AZ-TBD_tile_identity_uuidv5_bulk_list``. Changing this value invalidates +every existing tile identifier on both sides and requires a coordinated +cross-repo release. + +Name format passed into :func:`uuid.uuid5`: + +- :func:`derive_tile_id`: + ``"{zoom_level}/{tile_x}/{tile_y}/{source}/{flight_id}"`` with + ``flight_id`` rendered as the canonical 8-4-4-4-12 lowercase UUID + string. 
``None`` collapses to ``"00000000-0000-0000-0000-000000000000"`` + so per-source googlemaps tiles (no flight) yield a single deterministic + identity per cell + source. +- :func:`derive_location_hash`: ``"{zoom_level}/{tile_x}/{tile_y}"`` — no + source, no flight; shared across all rows for the cell. + +The ``name`` is UTF-8 encoded inside :func:`uuid.uuid5` (SHA-1 over the +namespace bytes followed by the encoded name). The satellite-provider C# +implementation MUST use the same UTF-8 encoding for the locked test +vectors in ``tests/unit/c6_tile_cache/test_uuid_namespace.py`` to match +byte-for-byte. +""" + +from __future__ import annotations + +from enum import Enum +from uuid import UUID, uuid5 + +TILE_NAMESPACE_UUID: UUID = UUID("5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0") +"""Pinned cross-repo UUIDv5 namespace; DO NOT regenerate (AZ-304 § Constraints).""" + +_ZERO_UUID: UUID = UUID("00000000-0000-0000-0000-000000000000") + + +def _normalize_source(source: object) -> str: + if isinstance(source, Enum): + value = source.value + else: + value = source + if not isinstance(value, str): + raise TypeError( + "derive_tile_id: source must be a str-Enum (TileSource) or str; " + f"got {type(source).__name__}" + ) + return value + + +def _normalize_flight_id(flight_id: object) -> str: + if flight_id is None: + return str(_ZERO_UUID) + if isinstance(flight_id, UUID): + return str(flight_id) + if isinstance(flight_id, str): + return str(UUID(flight_id)) + raise TypeError( + f"derive_tile_id: flight_id must be UUID, str, or None; got {type(flight_id).__name__}" + ) + + +def derive_tile_id( + zoom_level: int, + tile_x: int, + tile_y: int, + source: object, + flight_id: object, +) -> UUID: + """Compute the deterministic per-row ``tile_uuid``. + + See module docstring for the name format and cross-repo invariants. 
+ """ + source_str = _normalize_source(source) + flight_str = _normalize_flight_id(flight_id) + name = f"{int(zoom_level)}/{int(tile_x)}/{int(tile_y)}/{source_str}/{flight_str}" + return uuid5(TILE_NAMESPACE_UUID, name) + + +def derive_location_hash(zoom_level: int, tile_x: int, tile_y: int) -> UUID: + """Compute the per-cell-bag ``location_hash`` (AC-11 invariant).""" + name = f"{int(zoom_level)}/{int(tile_x)}/{int(tile_y)}" + return uuid5(TILE_NAMESPACE_UUID, name) + + +__all__ = [ + "TILE_NAMESPACE_UUID", + "derive_location_hash", + "derive_tile_id", +] diff --git a/src/gps_denied_onboard/components/c6_tile_cache/migrations.py b/src/gps_denied_onboard/components/c6_tile_cache/migrations.py new file mode 100644 index 0000000..fb70050 --- /dev/null +++ b/src/gps_denied_onboard/components/c6_tile_cache/migrations.py @@ -0,0 +1,236 @@ +"""C6 tile-cache Alembic migration runner (AZ-304). + +Public surface: + +- :class:`MigrationResult` — frozen dataclass describing the outcome of a + single :func:`apply_migrations` invocation. +- :func:`apply_migrations` — invoked by the composition root at startup + AFTER config load and BEFORE ``PostgresFilesystemStore`` construction. +- :class:`MigrationError` — raised on terminal migration failure; + deliberately NOT a member of the :class:`TileCacheError` family + because migrations run before any runtime error consumer is wired. + +Implementation notes: + +- The runner drives Alembic's ``command.upgrade(cfg, "head")`` against the + project-pinned env at ``db/migrations/`` (AZ-263 bootstrap). The env + already wires ``engine_from_config(..., poolclass=pool.NullPool)`` so + no separate connection pool is needed at apply time. +- Connection resolution: prefer ``config.c6_tile_cache.postgres_dsn`` if + non-empty; else fall back to the ``DB_URL`` env var (matches the + fallback in ``db/migrations/env.py``). 
+- One transient retry on serialization failures (PG SQLSTATE ``40001``) — + more conservative than the AZ-263 bootstrap because migrations are a + one-shot startup step where a second loud failure is the right signal. +""" + +from __future__ import annotations + +import os +from dataclasses import dataclass +from pathlib import Path +from typing import TYPE_CHECKING + +from alembic import command +from alembic.config import Config as AlembicConfig +from alembic.runtime.migration import MigrationContext +from alembic.script import ScriptDirectory +from sqlalchemy import create_engine +from sqlalchemy.exc import DBAPIError, OperationalError + +from gps_denied_onboard.components.c6_tile_cache._uuid_namespace import ( + TILE_NAMESPACE_UUID, +) +from gps_denied_onboard.logging.structured import get_logger + +if TYPE_CHECKING: + from gps_denied_onboard.config.schema import Config + + +_LOGGER = get_logger("c6_tile_cache.migrations") +_PROJECT_ROOT = Path(__file__).resolve().parents[4] +_ALEMBIC_INI = _PROJECT_ROOT / "alembic.ini" +_ALEMBIC_SCRIPT_LOCATION = _PROJECT_ROOT / "db" / "migrations" + + +class MigrationError(RuntimeError): + """Terminal failure applying c6_tile_cache Alembic migrations. + + NOT a member of the ``TileCacheError`` family by design: migrations + run during composition-root startup, before any runtime error + consumer (`c6_tile_cache.errors.TileCacheError`) exists. 
+ """ + + +@dataclass(frozen=True) +class MigrationResult: + """Outcome of one :func:`apply_migrations` invocation.""" + + applied: list[str] + current_revision: str + no_op: bool + + +def _resolve_dsn(config: Config) -> str: + block = config.components.get("c6_tile_cache") + dsn = getattr(block, "postgres_dsn", "") if block is not None else "" + if dsn: + return dsn + env_dsn = os.environ.get("DB_URL") + if env_dsn: + return env_dsn + raise MigrationError( + "c6_tile_cache.apply_migrations: no DSN available — " + "set config.components['c6_tile_cache'].postgres_dsn or " + "the DB_URL environment variable" + ) + + +def _to_sqlalchemy_url(raw_dsn: str) -> str: + """Normalise ``postgresql://`` → ``postgresql+psycopg://`` (psycopg3 driver). + + Matches the same transformation ``db/migrations/env.py`` applies on + the runtime path; doing it here as well so the pre-flight current-rev + probe uses the same driver. + """ + if raw_dsn.startswith("postgresql://"): + return raw_dsn.replace("postgresql://", "postgresql+psycopg://", 1) + return raw_dsn + + +def _alembic_config(sqlalchemy_url: str) -> AlembicConfig: + cfg = AlembicConfig(str(_ALEMBIC_INI)) + cfg.set_main_option("script_location", str(_ALEMBIC_SCRIPT_LOCATION)) + cfg.set_main_option("sqlalchemy.url", sqlalchemy_url) + return cfg + + +def _current_revision(sqlalchemy_url: str) -> str | None: + engine = create_engine(sqlalchemy_url, poolclass=None) + try: + with engine.connect() as conn: + context = MigrationContext.configure(conn) + return context.get_current_revision() + finally: + engine.dispose() + + +def _head_revision(cfg: AlembicConfig) -> str: + script = ScriptDirectory.from_config(cfg) + head = script.get_current_head() + if head is None: + raise MigrationError( + "c6_tile_cache.apply_migrations: Alembic script directory has no head; " + f"check {_ALEMBIC_SCRIPT_LOCATION}" + ) + return head + + +def _is_serialization_failure(exc: BaseException) -> bool: + """SQLSTATE 40001 — PG serialization conflict. 
+ + SQLAlchemy wraps psycopg's ``SerializationFailure`` inside + :class:`sqlalchemy.exc.DBAPIError`; we duck-type rather than import + ``psycopg.errors`` to keep this module independent of the underlying + driver package layout. + """ + if isinstance(exc, DBAPIError): + sqlstate = getattr(getattr(exc, "orig", None), "sqlstate", None) + return sqlstate == "40001" + return getattr(exc, "sqlstate", None) == "40001" + + +def apply_migrations(config: Config) -> MigrationResult: + """Apply pending c6_tile_cache Alembic migrations against the configured DB. + + Returns a :class:`MigrationResult` describing what was applied. Logs + an INFO record on every revision applied; logs an INFO ``no_op`` + record when the DB is already at head. Retries once on a PG + serialization failure (SQLSTATE 40001); raises :class:`MigrationError` + on the second failure or any other terminal error. + + The :data:`TILE_NAMESPACE_UUID` value is included in the structured + log payload on every apply / no-op so post-mortem drift detection is + trivial against the satellite-provider C# constant. 
+ """ + raw_dsn = _resolve_dsn(config) + sqlalchemy_url = _to_sqlalchemy_url(raw_dsn) + cfg = _alembic_config(sqlalchemy_url) + head_rev = _head_revision(cfg) + + last_exc: BaseException | None = None + for attempt in (0, 1): + try: + pre_rev = _current_revision(sqlalchemy_url) + if pre_rev == head_rev: + _LOGGER.info( + "c6.migrations: no-op (already at head)", + extra={ + "kind": "c6.migration.no_op", + "kv": { + "current_revision": head_rev, + "namespace_uuid": str(TILE_NAMESPACE_UUID), + }, + }, + ) + return MigrationResult(applied=[], current_revision=head_rev, no_op=True) + + command.upgrade(cfg, "head") + post_rev = _current_revision(sqlalchemy_url) or head_rev + applied = _resolve_applied(cfg, pre_rev, post_rev) + _LOGGER.info( + "c6.migrations: applied", + extra={ + "kind": "c6.migration.applied", + "kv": { + "revisions": applied, + "from_revision": pre_rev, + "to_revision": post_rev, + "namespace_uuid": str(TILE_NAMESPACE_UUID), + }, + }, + ) + return MigrationResult(applied=applied, current_revision=post_rev, no_op=False) + except (DBAPIError, OperationalError) as exc: + last_exc = exc + if _is_serialization_failure(exc) and attempt == 0: + # PG SQLSTATE 40001 — serialization conflict. Retry immediately: + # migrations run as a one-shot startup step before the injected + # Clock is constructed, so we cannot use `Clock.sleep`. Components + # are forbidden from using `time.sleep` directly (replay-determinism + # Invariant 2 / `tests/_meta/test_no_direct_time_in_components.py`), + # and a 0-50 ms backoff buys nothing here in practice: Alembic's + # NullPool starts a fresh connection on retry which already + # introduces enough natural jitter. + continue + raise MigrationError(f"c6_tile_cache.apply_migrations: database error: {exc}") from exc + except Exception as exc: + raise MigrationError(f"c6_tile_cache.apply_migrations: terminal error: {exc}") from exc + + # Defensive: loop must either return or raise; if we ever fall out, surface it. 
+ raise MigrationError( + f"c6_tile_cache.apply_migrations: exhausted retry attempts (last_exc={last_exc!r})" + ) + + +def _resolve_applied(cfg: AlembicConfig, pre_rev: str | None, post_rev: str) -> list[str]: + """Enumerate revisions between ``pre_rev`` (exclusive) and ``post_rev`` (inclusive).""" + if pre_rev == post_rev: + return [] + script = ScriptDirectory.from_config(cfg) + descending = [r.revision for r in script.walk_revisions(base="base", head=post_rev)] + if pre_rev is None: + return list(reversed(descending)) + try: + pre_idx = descending.index(pre_rev) + except ValueError: + # pre_rev is no longer in the chain (e.g., squashed) — return everything we walked. + return list(reversed(descending)) + return list(reversed(descending[:pre_idx])) + + +__all__ = [ + "MigrationError", + "MigrationResult", + "apply_migrations", +] diff --git a/tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql b/tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql new file mode 100644 index 0000000..cd052a2 --- /dev/null +++ b/tests/unit/c6_tile_cache/fixtures/c6_postgres_schema_v2.sql @@ -0,0 +1,156 @@ +-- AZ-304 — Expected post-0002 c6_tile_cache schema (Postgres 16). +-- +-- Human-readable canonical DDL for the schema state produced by applying +-- 0001_initial (AZ-263) + 0002_c6_tile_identity_and_lru (AZ-304). The +-- companion test `test_postgres_schema.py` introspects information_schema / +-- pg_indexes / pg_constraint / tile_freshness_rules row contents and +-- asserts each artifact below exists with the documented shape. +-- +-- Updating this file without a corresponding migration revision is a +-- Spec-Gap finding (High) at code review. 
+ + +-- ===================================================================== +-- 0001_initial (AZ-263) — unchanged baseline +-- ===================================================================== + +CREATE TABLE flights ( + id UUID PRIMARY KEY, + companion_id TEXT NOT NULL, + started_at TIMESTAMPTZ NOT NULL DEFAULT now(), + landed_at TIMESTAMPTZ, + metadata JSONB +); + +CREATE TABLE sector_classifications ( + id BIGSERIAL PRIMARY KEY, + sector_id TEXT NOT NULL UNIQUE, + classification TEXT NOT NULL, + freshness_threshold_days INTEGER NOT NULL, + -- 0002 additive bbox columns (operator-populated pre-flight; NULL pre-population): + min_lat DOUBLE PRECISION, + min_lon DOUBLE PRECISION, + max_lat DOUBLE PRECISION, + max_lon DOUBLE PRECISION +); + +CREATE TABLE tiles ( + id BIGSERIAL PRIMARY KEY, + -- Canonical columns mirrored from satellite-provider (AZ-263): + zoom_level INTEGER NOT NULL, + tile_x INTEGER NOT NULL, + tile_y INTEGER NOT NULL, + latitude DOUBLE PRECISION NOT NULL, + longitude DOUBLE PRECISION NOT NULL, + tile_size_meters DOUBLE PRECISION NOT NULL, + tile_size_pixels INTEGER NOT NULL, + capture_timestamp TIMESTAMPTZ NOT NULL, + compression TEXT NOT NULL DEFAULT 'jpeg', + crs TEXT NOT NULL DEFAULT 'EPSG:3857', + source TEXT NOT NULL, + -- Additive onboard-only columns (AZ-263): + flight_id UUID REFERENCES flights(id), + companion_id TEXT, + tile_quality_metadata JSONB, + voting_status TEXT, + freshness_status TEXT NOT NULL DEFAULT 'fresh', + signature BYTEA, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), + -- 0002 additive identity / LRU / disk-budget columns: + tile_uuid UUID NOT NULL UNIQUE, + location_hash UUID NOT NULL, + content_sha256 TEXT NOT NULL, + disk_bytes BIGINT NOT NULL, + accessed_at TIMESTAMPTZ NOT NULL DEFAULT now(), + uploaded_at TIMESTAMPTZ, + -- AZ-263 CHECKs (preserved): + CONSTRAINT ck_tiles_zoom CHECK (zoom_level BETWEEN 10 AND 22), + CONSTRAINT ck_tiles_meters CHECK 
(tile_size_meters > 0), + CONSTRAINT ck_tiles_pixels CHECK (tile_size_pixels > 0), + CONSTRAINT ck_tiles_source CHECK (source IN ('googlemaps','onboard_ingest')), + CONSTRAINT ck_tiles_voting_status CHECK ( + voting_status IS NULL OR voting_status IN ('pending','trusted','rejected') + ), + -- 0002 widened CHECK (UNION of AZ-263 + AZ-303 vocabularies): + CONSTRAINT ck_tiles_freshness_status CHECK ( + freshness_status IN ( + 'fresh','stale_warn','stale_reject', + 'stale_active_conflict','stale_rear','downgraded' + ) + ), + -- 0002 additive CHECKs: + CONSTRAINT ck_tiles_content_sha256_len CHECK (length(content_sha256) = 64), + CONSTRAINT ck_tiles_disk_bytes_nonneg CHECK (disk_bytes >= 0) +); + +-- AZ-263 indices (preserved): +CREATE INDEX ix_tiles_zxy ON tiles (zoom_level, tile_x, tile_y); +CREATE INDEX ix_tiles_lat_lon ON tiles (latitude, longitude); +CREATE INDEX ix_tiles_voting_status_onboard + ON tiles (voting_status) + WHERE source = 'onboard_ingest'; +CREATE INDEX ix_tiles_flight_id ON tiles (flight_id); +CREATE INDEX ix_tiles_created_at ON tiles (created_at); + +-- 0002 additive indices: +CREATE UNIQUE INDEX idx_tiles_natural_key ON tiles ( + zoom_level, + tile_x, + tile_y, + tile_size_meters, + source, + COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid) +); +CREATE INDEX idx_tiles_location_hash ON tiles (location_hash); +CREATE INDEX idx_tiles_accessed_at ON tiles (accessed_at); +CREATE INDEX idx_tiles_pending_upload ON tiles (uploaded_at) + WHERE source = 'onboard_ingest' AND uploaded_at IS NULL; +CREATE INDEX idx_tiles_flight_captured ON tiles (flight_id, capture_timestamp) + WHERE flight_id IS NOT NULL; + +-- Note: an automatic UNIQUE btree on `tiles.tile_uuid` is created by the +-- column-level UNIQUE; PG names it `tiles_tile_uuid_key` (system-generated). +-- The diff test does NOT assert that name; it asserts presence of a UNIQUE +-- index covering exactly `(tile_uuid)`. 
+ + +CREATE TABLE manifests ( + id BIGSERIAL PRIMARY KEY, + manifest_id TEXT NOT NULL UNIQUE, + flight_id UUID NOT NULL REFERENCES flights(id), + content_hash TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + payload JSONB NOT NULL +); + +CREATE TABLE engine_cache_entries ( + id BIGSERIAL PRIMARY KEY, + engine_path TEXT NOT NULL, + sm_arch TEXT NOT NULL, + jetpack_version TEXT NOT NULL, + tensorrt_version TEXT NOT NULL, + precision TEXT NOT NULL, + content_hash TEXT NOT NULL UNIQUE, + int8_calibration_path TEXT, + created_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + + +-- ===================================================================== +-- 0002_c6_tile_identity_and_lru (AZ-304) — new table +-- ===================================================================== + +CREATE TABLE tile_freshness_rules ( + classification TEXT PRIMARY KEY, + max_age_seconds BIGINT NOT NULL, + action TEXT NOT NULL, + set_at TIMESTAMPTZ NOT NULL DEFAULT now(), + CONSTRAINT ck_tfr_action CHECK (action IN ('reject','downgrade')), + CONSTRAINT ck_tfr_max_age_pos CHECK (max_age_seconds > 0) +); + +-- Seed rows applied by the 0002 migration (AC-7): +INSERT INTO tile_freshness_rules (classification, max_age_seconds, action) VALUES + ('active_conflict', 15552000, 'reject'), -- 6 months × 30 days × 86400 s + ('stable_rear', 31104000, 'downgrade'); -- 12 months × 30 days × 86400 s diff --git a/tests/unit/c6_tile_cache/test_migrations_runner.py b/tests/unit/c6_tile_cache/test_migrations_runner.py new file mode 100644 index 0000000..3866147 --- /dev/null +++ b/tests/unit/c6_tile_cache/test_migrations_runner.py @@ -0,0 +1,217 @@ +"""AZ-304 NFR-reliability-retry + unit tests for the c6_tile_cache migration runner. + +These tests stub :mod:`gps_denied_onboard.components.c6_tile_cache.migrations` +internals so the runner's retry / DSN-resolution / error-mapping paths can +be exercised without a real Postgres. 
The integration suite +(``test_postgres_schema.py``, ``@pytest.mark.docker``) covers the happy +path end-to-end. +""" + +from __future__ import annotations + +from typing import Any + +import pytest +from sqlalchemy.exc import DBAPIError + +from gps_denied_onboard.components.c6_tile_cache import migrations as migrations_module +from gps_denied_onboard.components.c6_tile_cache.config import C6TileCacheConfig +from gps_denied_onboard.components.c6_tile_cache.migrations import ( + MigrationError, + MigrationResult, + apply_migrations, +) +from gps_denied_onboard.config.schema import Config + + +def _config(dsn: str = "postgresql://user:pass@host:5432/db") -> Config: + return Config.with_blocks(c6_tile_cache=C6TileCacheConfig(postgres_dsn=dsn)) + + +class _FakeSerializationFailure(Exception): + """Lightweight stand-in for ``psycopg.errors.SerializationFailure``.""" + + sqlstate = "40001" + + +def _make_dbapi_error(orig: Exception) -> DBAPIError: + """Construct a SQLAlchemy DBAPIError wrapping ``orig`` (matches runtime shape).""" + return DBAPIError(statement="", params=None, orig=orig) + + +# ---------------------------------------------------------------------- +# DSN resolution. 
+ + +def test_resolve_dsn_prefers_config_over_env(monkeypatch: pytest.MonkeyPatch) -> None: + # Arrange + monkeypatch.setenv("DB_URL", "postgresql://env-host/db") + cfg = _config("postgresql://cfg-host/db") + + # Act + dsn = migrations_module._resolve_dsn(cfg) + + # Assert + assert dsn == "postgresql://cfg-host/db" + + +def test_resolve_dsn_falls_back_to_env(monkeypatch: pytest.MonkeyPatch) -> None: + # Arrange + monkeypatch.setenv("DB_URL", "postgresql://env-host/db") + cfg = _config("") + + # Act + dsn = migrations_module._resolve_dsn(cfg) + + # Assert + assert dsn == "postgresql://env-host/db" + + +def test_resolve_dsn_raises_when_no_source(monkeypatch: pytest.MonkeyPatch) -> None: + # Arrange + monkeypatch.delenv("DB_URL", raising=False) + cfg = _config("") + + # Act + Assert + with pytest.raises(MigrationError, match="no DSN available"): + migrations_module._resolve_dsn(cfg) + + +def test_to_sqlalchemy_url_rewrites_postgresql_prefix() -> None: + # Act + Assert + assert migrations_module._to_sqlalchemy_url("postgresql://h/db") == "postgresql+psycopg://h/db" + # Already-prefixed URLs pass through unchanged. + assert ( + migrations_module._to_sqlalchemy_url("postgresql+psycopg://h/db") + == "postgresql+psycopg://h/db" + ) + + +# ---------------------------------------------------------------------- +# NFR-reliability-retry: one SerializationFailure → retry → succeed. + + +def test_nfr_reliability_retry_once_on_serialization_failure( + monkeypatch: pytest.MonkeyPatch, +) -> None: + # Arrange — stub all DB-touching internals so the retry path is the only + # variable. Both iterations see the DB BELOW head pre-upgrade; the first + # upgrade attempt raises SQLSTATE 40001, the second succeeds. Post-upgrade + # revision is read once after the successful attempt. 
+ rev_sequence = iter([None, None, "0002_c6_tile_identity_and_lru"]) + upgrade_calls: list[int] = [] + + def fake_upgrade(_cfg: Any, _target: str) -> None: + upgrade_calls.append(len(upgrade_calls)) + if len(upgrade_calls) == 1: + raise _make_dbapi_error(_FakeSerializationFailure("conflict")) + + def fake_current_revision(_url: str) -> str | None: + return next(rev_sequence) + + monkeypatch.setattr( + "gps_denied_onboard.components.c6_tile_cache.migrations.command.upgrade", + fake_upgrade, + ) + monkeypatch.setattr(migrations_module, "_current_revision", fake_current_revision) + monkeypatch.setattr( + migrations_module, + "_head_revision", + lambda _cfg: "0002_c6_tile_identity_and_lru", + ) + monkeypatch.setattr( + migrations_module, + "_resolve_applied", + lambda _cfg, _pre, _post: ["0002_c6_tile_identity_and_lru"], + ) + + # Act + result = apply_migrations(_config()) + + # Assert + assert isinstance(result, MigrationResult) + assert result.applied == ["0002_c6_tile_identity_and_lru"] + assert result.no_op is False + assert len(upgrade_calls) == 2, "runner must retry on SQLSTATE 40001" + + +def test_nfr_reliability_terminal_after_two_serialization_failures( + monkeypatch: pytest.MonkeyPatch, +) -> None: + # Arrange + def fake_upgrade(_cfg: Any, _target: str) -> None: + raise _make_dbapi_error(_FakeSerializationFailure("persistent conflict")) + + monkeypatch.setattr( + "gps_denied_onboard.components.c6_tile_cache.migrations.command.upgrade", + fake_upgrade, + ) + monkeypatch.setattr(migrations_module, "_current_revision", lambda _url: None) + monkeypatch.setattr( + migrations_module, + "_head_revision", + lambda _cfg: "0002_c6_tile_identity_and_lru", + ) + + # Act + Assert + with pytest.raises(MigrationError, match="database error"): + apply_migrations(_config()) + + +def test_nfr_reliability_does_not_retry_on_non_serialization_dbapi_error( + monkeypatch: pytest.MonkeyPatch, +) -> None: + # Arrange + upgrade_calls: list[int] = [] + + class 
_GenericDbApiOrig(Exception): + sqlstate = "42P01" # "undefined_table" — NOT a serialization conflict + + def fake_upgrade(_cfg: Any, _target: str) -> None: + upgrade_calls.append(len(upgrade_calls)) + raise _make_dbapi_error(_GenericDbApiOrig("missing relation")) + + monkeypatch.setattr( + "gps_denied_onboard.components.c6_tile_cache.migrations.command.upgrade", + fake_upgrade, + ) + monkeypatch.setattr(migrations_module, "_current_revision", lambda _url: None) + monkeypatch.setattr( + migrations_module, + "_head_revision", + lambda _cfg: "0002_c6_tile_identity_and_lru", + ) + + # Act + Assert + with pytest.raises(MigrationError): + apply_migrations(_config()) + assert len(upgrade_calls) == 1, "non-40001 errors must fail-fast (no retry)" + + +def test_apply_migrations_logs_no_op_when_already_at_head( + monkeypatch: pytest.MonkeyPatch, +) -> None: + # Arrange + monkeypatch.setattr( + migrations_module, "_current_revision", lambda _url: "0002_c6_tile_identity_and_lru" + ) + monkeypatch.setattr( + migrations_module, "_head_revision", lambda _cfg: "0002_c6_tile_identity_and_lru" + ) + + # Act + result = apply_migrations(_config()) + + # Assert + assert result == MigrationResult( + applied=[], current_revision="0002_c6_tile_identity_and_lru", no_op=True + ) + + +def test_migration_error_is_not_a_tile_cache_error_member() -> None: + """`MigrationError` deliberately lives outside the TileCacheError family.""" + # Arrange + from gps_denied_onboard.components.c6_tile_cache.errors import TileCacheError + + # Act + Assert + assert not issubclass(MigrationError, TileCacheError) diff --git a/tests/unit/c6_tile_cache/test_postgres_schema.py b/tests/unit/c6_tile_cache/test_postgres_schema.py new file mode 100644 index 0000000..ce65480 --- /dev/null +++ b/tests/unit/c6_tile_cache/test_postgres_schema.py @@ -0,0 +1,678 @@ +"""AZ-304 — Schema-shape diff + per-AC integration tests against a real Postgres. 
+ +All tests in this module are ``@pytest.mark.docker`` (via the +module-level ``pytestmark``); they are auto-skipped on Tier-1 by +``tests/conftest.py`` so the project-wide unit suite stays hermetic. To +run locally: ``docker compose -f docker-compose.test.yml up -d db && \ +GPS_DENIED_TIER=2 DB_URL=postgresql://gps_denied:dev@localhost:5432/gps_denied \ +pytest tests/unit/c6_tile_cache/test_postgres_schema.py``. +""" + +from __future__ import annotations + +import dataclasses +import logging +import os +import time +from collections.abc import Iterator +from pathlib import Path +from uuid import UUID, uuid4 + +import psycopg +import pytest +from alembic import command +from alembic.config import Config as AlembicConfig + +from gps_denied_onboard.components.c6_tile_cache._uuid_namespace import ( + TILE_NAMESPACE_UUID, + derive_location_hash, + derive_tile_id, +) +from gps_denied_onboard.components.c6_tile_cache.config import C6TileCacheConfig +from gps_denied_onboard.components.c6_tile_cache.migrations import ( + MigrationResult, + apply_migrations, +) +from gps_denied_onboard.config.schema import Config + +pytestmark = pytest.mark.docker + + +_PROJECT_ROOT = Path(__file__).resolve().parents[3] +_ALEMBIC_INI = _PROJECT_ROOT / "alembic.ini" +_ALEMBIC_SCRIPT_LOCATION = _PROJECT_ROOT / "db" / "migrations" + +_AZ263_REV = "0001_initial" +_AZ304_REV = "0002_c6_tile_identity_and_lru" + +_FRESH_BAG_TABLES = ( + "tile_freshness_rules", + "engine_cache_entries", + "manifests", + "tiles", + "sector_classifications", + "flights", + "alembic_version", +) + +_AZ263_TILE_COLUMNS = { + "id", + "zoom_level", + "tile_x", + "tile_y", + "latitude", + "longitude", + "tile_size_meters", + "tile_size_pixels", + "capture_timestamp", + "compression", + "crs", + "source", + "flight_id", + "companion_id", + "tile_quality_metadata", + "voting_status", + "freshness_status", + "signature", + "created_at", + "updated_at", +} + +_AZ304_TILE_COLUMNS = { + "tile_uuid", + "location_hash", + 
"content_sha256", + "disk_bytes", + "accessed_at", + "uploaded_at", +} + +_AZ304_TILE_INDICES = { + "idx_tiles_natural_key", + "idx_tiles_location_hash", + "idx_tiles_accessed_at", + "idx_tiles_pending_upload", + "idx_tiles_flight_captured", +} + +_AZ263_TILE_INDICES = { + "ix_tiles_zxy", + "ix_tiles_lat_lon", + "ix_tiles_voting_status_onboard", + "ix_tiles_flight_id", + "ix_tiles_created_at", +} + + +def _to_sqlalchemy_url(raw_dsn: str) -> str: + if raw_dsn.startswith("postgresql://"): + return raw_dsn.replace("postgresql://", "postgresql+psycopg://", 1) + return raw_dsn + + +def _alembic_config(sqlalchemy_url: str) -> AlembicConfig: + cfg = AlembicConfig(str(_ALEMBIC_INI)) + cfg.set_main_option("script_location", str(_ALEMBIC_SCRIPT_LOCATION)) + cfg.set_main_option("sqlalchemy.url", sqlalchemy_url) + return cfg + + +def _build_config(dsn: str) -> Config: + block = C6TileCacheConfig(postgres_dsn=dsn) + return Config.with_blocks(c6_tile_cache=block) + + +def _exec(conn: psycopg.Connection, sql: str, params: tuple[object, ...] | None = None) -> None: + with conn.cursor() as cur: + cur.execute(sql, params or ()) + + +def _fetchone( + conn: psycopg.Connection, sql: str, params: tuple[object, ...] | None = None +) -> tuple[object, ...] | None: + with conn.cursor() as cur: + cur.execute(sql, params or ()) + return cur.fetchone() + + +def _fetchall( + conn: psycopg.Connection, sql: str, params: tuple[object, ...] 
| None = None +) -> list[tuple[object, ...]]: + with conn.cursor() as cur: + cur.execute(sql, params or ()) + return cur.fetchall() + + +def _column_metadata(conn: psycopg.Connection, table: str) -> dict[str, tuple[object, ...]]: + rows = _fetchall( + conn, + """ + SELECT column_name, data_type, is_nullable, column_default + FROM information_schema.columns + WHERE table_schema = 'public' AND table_name = %s + """, + (table,), + ) + return {str(row[0]): (row[1], row[2], row[3]) for row in rows} + + +def _index_names(conn: psycopg.Connection, table: str) -> set[str]: + rows = _fetchall( + conn, + """ + SELECT indexname FROM pg_indexes + WHERE schemaname = 'public' AND tablename = %s + """, + (table,), + ) + return {str(row[0]) for row in rows} + + +def _check_constraint_source(conn: psycopg.Connection, name: str) -> str | None: + row = _fetchone( + conn, + """ + SELECT pg_get_constraintdef(c.oid) + FROM pg_constraint c + WHERE c.contype = 'c' AND c.conname = %s + """, + (name,), + ) + if row is None: + return None + value = row[0] + return None if value is None else str(value) + + +@pytest.fixture +def db_url() -> str: + url = os.environ.get("DB_URL") + if not url: + pytest.skip("DB_URL not set — start docker-compose.test.yml `db` service first") + return url + + +@pytest.fixture +def fresh_db(db_url: str) -> Iterator[str]: + """Drop all c6 tables + alembic_version; yield the same DSN.""" + with psycopg.connect(db_url, autocommit=True) as conn: + tables = ", ".join(_FRESH_BAG_TABLES) + _exec(conn, f"DROP TABLE IF EXISTS {tables} CASCADE") + yield db_url + # Leave the DB dirty after each test; the next test's fresh_db drops again. 
+ + +@pytest.fixture +def baselined_db(fresh_db: str) -> str: + """Apply only 0001_initial; return DB at AZ-263 head.""" + sa_url = _to_sqlalchemy_url(fresh_db) + cfg = _alembic_config(sa_url) + command.upgrade(cfg, _AZ263_REV) + return fresh_db + + +# ---------------------------------------------------------------------- +# AC-1 / AC-9: apply on AZ-263-baselined DB; AZ-263 columns unchanged. + + +def test_ac1_apply_creates_additive_artifacts(baselined_db: str) -> None: + # Arrange + config = _build_config(baselined_db) + + # Act + result = apply_migrations(config) + + # Assert: runner result + assert result.applied == [_AZ304_REV] + assert result.current_revision == _AZ304_REV + assert result.no_op is False + + # Assert: schema artifacts + with psycopg.connect(baselined_db) as conn: + tiles_cols = _column_metadata(conn, "tiles") + for col in _AZ304_TILE_COLUMNS: + assert col in tiles_cols, f"tiles.{col} missing post-0002" + for col in _AZ263_TILE_COLUMNS: + assert col in tiles_cols, f"AZ-263 tiles.{col} dropped by 0002 — regression" + + tiles_idx = _index_names(conn, "tiles") + for idx in _AZ263_TILE_INDICES | _AZ304_TILE_INDICES: + assert idx in tiles_idx, f"tiles index {idx!r} missing" + + sector_cols = _column_metadata(conn, "sector_classifications") + for col in ("min_lat", "min_lon", "max_lat", "max_lon"): + assert col in sector_cols, f"sector_classifications.{col} missing" + + rules_rows = _fetchall( + conn, "SELECT classification, max_age_seconds, action FROM tile_freshness_rules" + ) + assert len(rules_rows) == 2 + + +def test_ac9_az263_columns_byte_identical(baselined_db: str) -> None: + """AZ-263 column metadata is byte-identical pre- and post-0002.""" + # Arrange: snapshot pre-migration column metadata. 
+ with psycopg.connect(baselined_db) as conn: + before = _column_metadata(conn, "tiles") + before_flights = _column_metadata(conn, "flights") + before_sector = _column_metadata(conn, "sector_classifications") + + # Act + apply_migrations(_build_config(baselined_db)) + + # Assert + with psycopg.connect(baselined_db) as conn: + after = _column_metadata(conn, "tiles") + after_flights = _column_metadata(conn, "flights") + after_sector = _column_metadata(conn, "sector_classifications") + + for col, meta in before.items(): + assert after.get(col) == meta, f"tiles.{col} drifted: {meta} -> {after.get(col)}" + assert after_flights == before_flights, "flights columns drifted" + # sector_classifications has new NULLable columns; assert the AZ-263 ones survived. + for col, meta in before_sector.items(): + assert after_sector.get(col) == meta, f"sector_classifications.{col} drifted" + + +# ---------------------------------------------------------------------- +# AC-2: no-op at head. + + +def test_ac2_apply_is_noop_at_head(baselined_db: str) -> None: + # Arrange + config = _build_config(baselined_db) + apply_migrations(config) + + # Act + second = apply_migrations(config) + + # Assert + assert second.applied == [] + assert second.no_op is True + assert second.current_revision == _AZ304_REV + + +# ---------------------------------------------------------------------- +# AC-3: widened freshness_status CHECK + new CHECKs exist. 
+ + +def test_ac3_freshness_check_widened(baselined_db: str) -> None: + # Act + apply_migrations(_build_config(baselined_db)) + + # Assert + with psycopg.connect(baselined_db) as conn: + defn = _check_constraint_source(conn, "ck_tiles_freshness_status") + assert defn is not None + for value in ( + "'fresh'", + "'stale_warn'", + "'stale_reject'", + "'stale_active_conflict'", + "'stale_rear'", + "'downgraded'", + ): + assert value in defn, f"widened CHECK missing {value}; got: {defn}" + + +def test_ac3_new_check_constraints_present(baselined_db: str) -> None: + # Act + apply_migrations(_build_config(baselined_db)) + + # Assert + with psycopg.connect(baselined_db) as conn: + sha_defn = _check_constraint_source(conn, "ck_tiles_content_sha256_len") + bytes_defn = _check_constraint_source(conn, "ck_tiles_disk_bytes_nonneg") + action_defn = _check_constraint_source(conn, "ck_tfr_action") + assert sha_defn is not None and "length(content_sha256) = 64" in sha_defn + assert bytes_defn is not None and "disk_bytes >= 0" in bytes_defn + assert action_defn is not None and "reject" in action_defn and "downgrade" in action_defn + + +# ---------------------------------------------------------------------- +# AC-4 + AC-4b: natural-key UNIQUE allows per-flight separation, rejects duplicates. 
+ + +def _insert_tile( + conn: psycopg.Connection, + *, + zoom_level: int, + tile_x: int, + tile_y: int, + source: str, + flight_id: UUID | None, + content_sha256: str, + flight_table_id: UUID | None = None, +) -> None: + """Direct INSERT used by AC-4/AC-4b tests (no PostgresFilesystemStore yet).""" + tile_uuid = derive_tile_id(zoom_level, tile_x, tile_y, source, flight_id) + location_hash = derive_location_hash(zoom_level, tile_x, tile_y) + _exec( + conn, + """ + INSERT INTO tiles ( + zoom_level, tile_x, tile_y, latitude, longitude, + tile_size_meters, tile_size_pixels, capture_timestamp, source, + flight_id, + tile_uuid, location_hash, content_sha256, disk_bytes + ) VALUES ( + %s, %s, %s, 0.0, 0.0, + 256.0, 256, now(), %s, + %s, + %s, %s, %s, 1024 + ) + """, + ( + zoom_level, + tile_x, + tile_y, + source, + flight_table_id if flight_table_id is not None else flight_id, + tile_uuid, + location_hash, + content_sha256, + ), + ) + + +def test_ac4_natural_key_allows_different_flights_same_cell(baselined_db: str) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) + flight_a, flight_b = uuid4(), uuid4() + + # Act + with psycopg.connect(baselined_db, autocommit=True) as conn: + _exec( + conn, + "INSERT INTO flights (id, companion_id, started_at) VALUES (%s, 'comp', now()), (%s, 'comp', now())", + (flight_a, flight_b), + ) + _insert_tile( + conn, + zoom_level=18, + tile_x=10, + tile_y=20, + source="onboard_ingest", + flight_id=flight_a, + content_sha256="a" * 64, + ) + _insert_tile( + conn, + zoom_level=18, + tile_x=10, + tile_y=20, + source="onboard_ingest", + flight_id=flight_b, + content_sha256="b" * 64, + ) + + # Assert + rows = _fetchall( + conn, + "SELECT tile_uuid, location_hash FROM tiles WHERE tile_x=10 AND tile_y=20", + ) + assert len(rows) == 2 + tile_uuids = {row[0] for row in rows} + location_hashes = {row[1] for row in rows} + assert len(tile_uuids) == 2, "per-flight tile_uuid collision" + assert len(location_hashes) == 1, "location_hash 
should match across flights" + + +def test_ac4b_natural_key_rejects_duplicate_flight_insert(baselined_db: str) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) + + # Act + Assert + with psycopg.connect(baselined_db, autocommit=True) as conn: + _insert_tile( + conn, + zoom_level=18, + tile_x=30, + tile_y=40, + source="googlemaps", + flight_id=None, + content_sha256="c" * 64, + ) + with pytest.raises(psycopg.errors.UniqueViolation): + # Same natural key (both flight_id=NULL → both coalesce to zero UUID). + # Use a different content_sha256 so the rejection comes from the + # natural-key index, not a coincidental UNIQUE elsewhere. + _exec( + conn, + """ + INSERT INTO tiles ( + zoom_level, tile_x, tile_y, latitude, longitude, + tile_size_meters, tile_size_pixels, capture_timestamp, source, + tile_uuid, location_hash, content_sha256, disk_bytes + ) VALUES ( + 18, 30, 40, 0.0, 0.0, + 256.0, 256, now(), 'googlemaps', + %s, %s, %s, 1024 + ) + """, + (uuid4(), uuid4(), "d" * 64), + ) + + +# ---------------------------------------------------------------------- +# AC-5: widened CHECK accepts all six values; rejects bogus. 
+ + +@pytest.mark.parametrize( + "freshness_value", + [ + "fresh", + "stale_warn", + "stale_reject", + "stale_active_conflict", + "stale_rear", + "downgraded", + ], +) +def test_ac5_widened_check_accepts_union_values(baselined_db: str, freshness_value: str) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) + + # Act + with psycopg.connect(baselined_db, autocommit=True) as conn: + _exec( + conn, + """ + INSERT INTO tiles ( + zoom_level, tile_x, tile_y, latitude, longitude, + tile_size_meters, tile_size_pixels, capture_timestamp, source, + tile_uuid, location_hash, content_sha256, disk_bytes, + freshness_status + ) VALUES ( + 18, 1, 1, 0.0, 0.0, + 256.0, 256, now(), 'googlemaps', + %s, %s, %s, 1024, + %s + ) + """, + (uuid4(), uuid4(), "e" * 64, freshness_value), + ) + # Assert + rows = _fetchall( + conn, + "SELECT freshness_status FROM tiles WHERE freshness_status = %s", + (freshness_value,), + ) + assert any(row[0] == freshness_value for row in rows) + + +def test_ac5_widened_check_rejects_bogus(baselined_db: str) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) + + # Act + Assert + with psycopg.connect(baselined_db, autocommit=True) as conn: + with pytest.raises(psycopg.errors.CheckViolation): + _exec( + conn, + """ + INSERT INTO tiles ( + zoom_level, tile_x, tile_y, latitude, longitude, + tile_size_meters, tile_size_pixels, capture_timestamp, source, + tile_uuid, location_hash, content_sha256, disk_bytes, + freshness_status + ) VALUES ( + 18, 2, 2, 0.0, 0.0, + 256.0, 256, now(), 'googlemaps', + %s, %s, %s, 1024, + 'bogus' + ) + """, + (uuid4(), uuid4(), "f" * 64), + ) + + +# ---------------------------------------------------------------------- +# AC-6: down migration reverses cleanly; subsequent upgrade re-applies. 
+ + +def test_ac6_downgrade_reverses_cleanly(baselined_db: str) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) + sa_url = _to_sqlalchemy_url(baselined_db) + cfg = _alembic_config(sa_url) + + # Act: downgrade one revision + command.downgrade(cfg, "-1") + + # Assert: AZ-304 artifacts gone, AZ-263 baseline intact. + with psycopg.connect(baselined_db) as conn: + tiles_cols = _column_metadata(conn, "tiles") + for col in _AZ304_TILE_COLUMNS: + assert col not in tiles_cols, f"tiles.{col} should be dropped after downgrade" + for col in _AZ263_TILE_COLUMNS: + assert col in tiles_cols, f"AZ-263 tiles.{col} dropped by downgrade — regression" + + tile_table = _fetchone( + conn, + "SELECT to_regclass('public.tile_freshness_rules')", + ) + assert tile_table is not None and tile_table[0] is None + + defn = _check_constraint_source(conn, "ck_tiles_freshness_status") + assert defn is not None + # AZ-263 vocabulary only. + assert "'stale_active_conflict'" not in defn + assert "'stale_rear'" not in defn + assert "'downgraded'" not in defn + + # Act: re-upgrade + command.upgrade(cfg, "head") + + # Assert: clean re-apply + with psycopg.connect(baselined_db) as conn: + tiles_cols = _column_metadata(conn, "tiles") + for col in _AZ304_TILE_COLUMNS: + assert col in tiles_cols + + +# ---------------------------------------------------------------------- +# AC-7: seed rows present with documented values. + + +def test_ac7_freshness_rules_seeded(baselined_db: str) -> None: + # Act + apply_migrations(_build_config(baselined_db)) + + # Assert + with psycopg.connect(baselined_db) as conn: + rows = _fetchall( + conn, + "SELECT classification, max_age_seconds, action FROM tile_freshness_rules ORDER BY classification", + ) + assert rows == [ + ("active_conflict", 15552000, "reject"), + ("stable_rear", 31104000, "downgrade"), + ] + + +# ---------------------------------------------------------------------- +# AC-8: log INFO records carry kind / namespace_uuid. 
+ + +def test_ac8_apply_logs_kind_applied(baselined_db: str, caplog: pytest.LogCaptureFixture) -> None: + # Arrange + caplog.set_level(logging.INFO, logger="c6_tile_cache.migrations") + + # Act + apply_migrations(_build_config(baselined_db)) + + # Assert + applied_records = [ + r for r in caplog.records if getattr(r, "kind", None) == "c6.migration.applied" + ] + assert len(applied_records) == 1 + kv = getattr(applied_records[0], "kv", {}) + assert kv.get("revisions") == [_AZ304_REV] + assert kv.get("namespace_uuid") == str(TILE_NAMESPACE_UUID) + + +def test_ac8_noop_logs_kind_no_op(baselined_db: str, caplog: pytest.LogCaptureFixture) -> None: + # Arrange + apply_migrations(_build_config(baselined_db)) # first apply + caplog.clear() + caplog.set_level(logging.INFO, logger="c6_tile_cache.migrations") + + # Act + apply_migrations(_build_config(baselined_db)) # second = no-op + + # Assert + noop_records = [r for r in caplog.records if getattr(r, "kind", None) == "c6.migration.no_op"] + assert len(noop_records) == 1 + kv = getattr(noop_records[0], "kv", {}) + assert kv.get("current_revision") == _AZ304_REV + assert kv.get("namespace_uuid") == str(TILE_NAMESPACE_UUID) + + +# ---------------------------------------------------------------------- +# NFR-perf-apply / NFR-perf-noop: timing budgets. 
+ + +def test_nfr_perf_apply_under_5s(baselined_db: str) -> None: + # Arrange + config = _build_config(baselined_db) + + # Act + t0 = time.perf_counter() + apply_migrations(config) + elapsed = time.perf_counter() - t0 + + # Assert + assert elapsed < 5.0, f"apply took {elapsed:.3f}s (>5s budget)" + + +def test_nfr_perf_noop_under_100ms(baselined_db: str) -> None: + # Arrange + config = _build_config(baselined_db) + apply_migrations(config) + + # Act + t0 = time.perf_counter() + apply_migrations(config) + elapsed = time.perf_counter() - t0 + + # Assert + assert elapsed < 0.100, f"no-op took {elapsed * 1000:.1f}ms (>100ms budget)" + + +# ---------------------------------------------------------------------- +# Smoke: MigrationResult is frozen. + + +def test_migration_result_is_frozen() -> None: + # Arrange + result = MigrationResult(applied=["x"], current_revision="x", no_op=False) + + # Act + Assert + with pytest.raises((dataclasses.FrozenInstanceError, AttributeError)): + result.no_op = True # type: ignore[misc] + + +# AC-12 (`TileMetadata.location_hash` default = None) is covered in the +# AZ-303 protocol-conformance suite (`test_protocol_conformance.py`); no +# Postgres needed. diff --git a/tests/unit/c6_tile_cache/test_uuid_namespace.py b/tests/unit/c6_tile_cache/test_uuid_namespace.py new file mode 100644 index 0000000..d42139e --- /dev/null +++ b/tests/unit/c6_tile_cache/test_uuid_namespace.py @@ -0,0 +1,177 @@ +"""AZ-304 AC-10 / AC-11 — UUIDv5 namespace determinism + cross-repo coordination. + +The expected UUIDs locked below are the *cross-repo coordination +evidence* between ``gps-denied-onboard`` (Python ``uuid.uuid5``) and +``satellite-provider`` (C# UUIDv5 implementation per +``AZ-TBD_tile_identity_uuidv5_bulk_list``). Both sides MUST emit +byte-identical UUIDs for these input vectors; changing any expected +value here without a coordinated cross-workspace release breaks the +correlation key joining the two systems.
+""" + +from __future__ import annotations + +from uuid import UUID + +import pytest + +from gps_denied_onboard.components.c6_tile_cache._types import TileSource +from gps_denied_onboard.components.c6_tile_cache._uuid_namespace import ( + TILE_NAMESPACE_UUID, + derive_location_hash, + derive_tile_id, +) + +_EXPECTED_NAMESPACE = UUID("5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0") + + +# AC-10 — five locked tile-id vectors. (z, x, y, source, flight_id) -> expected uuidv5. +_TILE_ID_VECTORS: list[tuple[int, int, int, str, str | None, str]] = [ + (18, 72346, 46342, "googlemaps", None, "6f49531b-1351-55ba-b733-66d3f1fca1a5"), + ( + 18, + 72346, + 46342, + "onboard_ingest", + "11111111-2222-3333-4444-555555555555", + "c7f6eda4-3b95-5818-a0b7-1aa8cbb5aa95", + ), + (10, 300, 200, "googlemaps", None, "3604dd59-1018-5889-97dc-ba5635761ac5"), + ( + 21, + 999999, + 999999, + "onboard_ingest", + "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", + "36c6e0c0-54a1-56ab-9dfd-a4f8f184cb22", + ), + (15, 1000, 2000, "googlemaps", None, "955df722-8d4e-5375-8a23-4f45dc16fef1"), +] + + +# AC-11 — four locked location-hash vectors. (z, x, y) -> expected uuidv5. +_LOCATION_HASH_VECTORS: list[tuple[int, int, int, str]] = [ + (18, 72346, 46342, "e95c7edb-550e-58eb-8f94-3056f73a57d3"), + (10, 300, 200, "76aa22b7-fd8e-5089-8b20-c45fb4a0f5e8"), + (21, 999999, 999999, "4337a27e-f118-524f-8d74-82cf9295c632"), + (15, 1000, 2000, "0501b2fc-0fc8-5330-a407-c7ccbf1fb9c7"), +] + + +# ---------------------------------------------------------------------- +# AC-10: namespace + derivation are deterministic and locked. 
+ + +def test_ac10_namespace_uuid_locked() -> None: + # Assert + assert TILE_NAMESPACE_UUID == _EXPECTED_NAMESPACE + assert TILE_NAMESPACE_UUID.version == 4 # the namespace itself was a one-time UUIDv4 pick + assert str(TILE_NAMESPACE_UUID) == "5b8d0c2e-1a4f-4b3a-8c9d-e7f6a3b2c1d0" + + +@pytest.mark.parametrize("z,x,y,source,flight_id,expected", _TILE_ID_VECTORS) +def test_ac10_derive_tile_id_locked_vectors( + z: int, x: int, y: int, source: str, flight_id: str | None, expected: str +) -> None: + # Act + result = derive_tile_id(z, x, y, source, flight_id) + + # Assert + assert result == UUID(expected) + assert result.version == 5 + + +@pytest.mark.parametrize("z,x,y,source,flight_id,expected", _TILE_ID_VECTORS) +def test_ac10_derive_tile_id_idempotent_on_second_call( + z: int, x: int, y: int, source: str, flight_id: str | None, expected: str +) -> None: + # Act + first = derive_tile_id(z, x, y, source, flight_id) + second = derive_tile_id(z, x, y, source, flight_id) + + # Assert + assert first == second == UUID(expected) + + +def test_ac10_derive_tile_id_accepts_tile_source_enum() -> None: + """Passing a `TileSource` enum yields the same UUID as its string value.""" + # Act + from_enum = derive_tile_id(18, 72346, 46342, TileSource.GOOGLEMAPS, None) + from_str = derive_tile_id(18, 72346, 46342, "googlemaps", None) + + # Assert + assert from_enum == from_str + assert from_enum == UUID("6f49531b-1351-55ba-b733-66d3f1fca1a5") + + +def test_ac10_derive_tile_id_accepts_uuid_flight_id() -> None: + """Passing a `UUID` instance for ``flight_id`` matches the string form.""" + # Arrange + flight_uuid_str = "11111111-2222-3333-4444-555555555555" + + # Act + from_uuid = derive_tile_id(18, 72346, 46342, "onboard_ingest", UUID(flight_uuid_str)) + from_str = derive_tile_id(18, 72346, 46342, "onboard_ingest", flight_uuid_str) + + # Assert + assert from_uuid == from_str + assert from_uuid == UUID("c7f6eda4-3b95-5818-a0b7-1aa8cbb5aa95") + + +def 
test_ac10_derive_tile_id_rejects_unknown_source_type() -> None: + # Act + Assert + with pytest.raises(TypeError, match="source must be"): + derive_tile_id(18, 0, 0, 123, None) + + +def test_ac10_derive_tile_id_rejects_unknown_flight_id_type() -> None: + # Act + Assert + with pytest.raises(TypeError, match="flight_id must be"): + derive_tile_id(18, 0, 0, "googlemaps", 12345) + + +def test_ac10_derive_tile_id_rejects_malformed_flight_id_string() -> None: + # Act + Assert + with pytest.raises(ValueError): + derive_tile_id(18, 0, 0, "googlemaps", "not-a-uuid") + + +# ---------------------------------------------------------------------- +# AC-11: location_hash is invariant across source / flight_id. + + +@pytest.mark.parametrize("z,x,y,expected", _LOCATION_HASH_VECTORS) +def test_ac11_derive_location_hash_locked_vectors(z: int, x: int, y: int, expected: str) -> None: + # Act + result = derive_location_hash(z, x, y) + + # Assert + assert result == UUID(expected) + assert result.version == 5 + + +def test_ac11_location_hash_invariant_across_source_and_flight() -> None: + """Same (z, x, y) yields the same location_hash for any source + flight_id.""" + # Arrange + cell = (18, 72346, 46342) + cell_lh = derive_location_hash(*cell) + + # Act + Assert — `derive_tile_id` produces _different_ tile UUIDs for + # different source/flight combos but the cell-bag is the same; this is + # explicitly tested via the locked vectors above. Here we only need to + # confirm that `location_hash` NEVER depends on these inputs. 
+ for _source in ("googlemaps", "onboard_ingest"): + for _flight_id in (None, "11111111-2222-3333-4444-555555555555"): + assert derive_location_hash(*cell) == cell_lh + + +def test_ac11_different_cells_yield_different_location_hashes() -> None: + # Act + lh1 = derive_location_hash(18, 72346, 46342) + lh2 = derive_location_hash(18, 72346, 46343) # neighbour cell + lh3 = derive_location_hash(19, 72346, 46342) # same x,y at different zoom + + # Assert + assert lh1 != lh2 + assert lh1 != lh3 + assert lh2 != lh3 diff --git a/tests/unit/test_ac5_alembic.py b/tests/unit/test_ac5_alembic.py index 353b1c2..6059631 100644 --- a/tests/unit/test_ac5_alembic.py +++ b/tests/unit/test_ac5_alembic.py @@ -20,7 +20,14 @@ REPO_ROOT = Path(__file__).resolve().parents[2] MIGRATION_BODY = (REPO_ROOT / "db" / "migrations" / "versions" / "0001_initial.py").read_text() -def test_head_revision_is_0001_initial() -> None: +def test_head_revision_matches_latest_migration() -> None: + """Asserts the Alembic head tracks the latest migration on disk. + + AZ-263 originally pinned this to ``0001_initial``; AZ-304 advanced the head + to ``0002_c6_tile_identity_and_lru`` (additive on AZ-263 — see + ``_docs/02_tasks/todo/AZ-304_c6_postgres_schema.md``). Future migrations + update this assertion in lockstep with the new head. + """ # Arrange cwd = os.getcwd() os.chdir(REPO_ROOT) @@ -33,7 +40,7 @@ def test_head_revision_is_0001_initial() -> None: os.chdir(cwd) # Assert - assert list(heads) == ["0001_initial"], f"unexpected heads: {heads}" + assert list(heads) == ["0002_c6_tile_identity_and_lru"], f"unexpected heads: {heads}" @pytest.mark.parametrize(