Adds the production PostgresFilesystemStore implementing both protocols in a single class. Filesystem-backed JPEG I/O (atomic sidecar write, read-only mmap) + Postgres-backed metadata (spatial bbox, LRU, voting, upload bookkeeping). Wires composition via `from_config` classmethod.

Key behaviors:
- AC-3 strict reading: INSERT runs first inside an open transaction; duplicate-key collisions raise `TileMetadataError` BEFORE any byte is written, leaving the original file + sidecar byte-identical. Atomic sidecar write happens inside the same transaction; commit closes it. Comp-delete remains as a safety net for the rare commit-after-write failure path.
- AC-2 content-hash gate runs before any I/O.
- Construction performs an orphan-file reconciliation scan and emits an INFO `c6.store.construct` log with steady-state stats.

Adds `c6.write` and `c6.write_failed` FDR record kinds (schema v1.1.0, forward-compatible) and a thin operator CLI at `c6_tile_cache.tools dump` for inspection.

Dependencies: adds `psycopg-pool>=3.2,<4.0` for the connection pool used on the F3 read-hot path.

Tests: 25 new tests for c6_tile_cache cover AC-1..AC-15 plus MmapTilePixelHandle + helper round-trips. Full Tier-2 unit suite passes (1215 passed, 8 skipped, 1 pre-existing unrelated failure `test_ac8_read_host_tuple_on_jetson`: missing `pynvml` on macOS, Jetson-only).

Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 28 / Cycle 1 — Implementation Report
Date: 2026-05-12
Tasks: AZ-305 (C6 PostgresFilesystemStore: TileStore + TileMetadataStore production impl)
Story points landed: 5
Status: complete (AZ-305 → In Testing)
Scope summary
Single-task batch landing the production PostgresFilesystemStore — the
single class that satisfies BOTH TileStore (filesystem-backed JPEG I/O
byte-identical to satellite-provider) and TileMetadataStore
(Postgres-backed spatial / LRU / voting state). Owns the full insert
path (atomic-write + SHA-256 sidecar via AZ-280, content-hash gate,
single-transaction row insert, compensating delete on failure), the read
path (MmapTilePixelHandle read-only mmap, btree-indexed bbox query,
LRU access stamp), and bookkeeping (mark_uploaded,
update_voting_status, lru_candidates, total_disk_bytes). Wires the
freshness-gate call site (pass-through hook for AZ-307 to replace) and
exposes the LRU primitives AZ-308 will consume.
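The first two steps of the insert path (content-hash gate, then atomic write with a SHA-256 sidecar) can be sketched roughly as below. This is only an illustration under assumptions: `ContentHashMismatch`, `write_with_sidecar`, the `.sha256` sidecar suffix, and the body-then-sidecar order are hypothetical and do not reflect the actual AZ-280 helper.

```python
import hashlib
import os
import tempfile
from pathlib import Path


class ContentHashMismatch(Exception):
    """Hypothetical error type; the real store raises from its own error family."""


def _atomic_write(path: Path, data: bytes) -> None:
    """Write to a temp file in the target directory, fsync, then rename into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)  # atomic when source and target share a filesystem
    except BaseException:
        # Best-effort cleanup so a failed write leaves no temp file behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def write_with_sidecar(path: Path, blob: bytes, expected_sha256: str) -> None:
    """Reject a hash mismatch before any I/O, then write the body and its sidecar."""
    actual = hashlib.sha256(blob).hexdigest()
    if actual != expected_sha256:
        raise ContentHashMismatch(f"expected {expected_sha256}, got {actual}")
    _atomic_write(path, blob)
    # Assumed sidecar naming: "<file name>.sha256" holding the hex digest.
    _atomic_write(path.with_name(path.name + ".sha256"), actual.encode("ascii"))
```

Because the hash comparison happens before `_atomic_write` is ever called, a mismatching blob leaves the directory untouched, which is the property the content-hash gate is meant to provide.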
The class is invoked from storage_factory via a new from_config
classmethod that resolves the psycopg_pool.ConnectionPool, the
producer-local FdrClient (via make_fdr_client), and the project
logger. __init__ itself takes explicit injected dependencies so unit
tests can substitute the FakeFdrSink, a tmp_path root, and a
test-managed pool without touching the composition root.
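The split between an explicit-dependency `__init__` and a `from_config` classmethod might look roughly like the sketch below. Every name here (`SketchStore`, `SketchConfig`, `ListFdrSink`, `_open_pool`) is a hypothetical stand-in; the real class resolves a `psycopg_pool.ConnectionPool` and an `FdrClient`, which this sketch replaces with placeholders so it runs without Postgres.

```python
import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Protocol


class FdrSink(Protocol):
    """Hypothetical minimal interface for whatever receives FDR records."""

    def emit(self, kind: str, payload: dict) -> None: ...


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for the component config."""

    tiles_dir: Path
    postgres_dsn: str
    postgres_pool_size: int = 4


class ListFdrSink:
    """In-memory sink, playing the role a fake sink plays in unit tests."""

    def __init__(self) -> None:
        self.records: list = []

    def emit(self, kind: str, payload: dict) -> None:
        self.records.append((kind, payload))


def _open_pool(config: SketchConfig) -> dict:
    # Placeholder only: the real code would build a connection pool here.
    return {"dsn": config.postgres_dsn, "size": config.postgres_pool_size}


class SketchStore:
    def __init__(self, *, tiles_dir: Path, pool: Any, fdr: FdrSink,
                 logger: logging.Logger) -> None:
        # __init__ only stores what it is handed; nothing is resolved here.
        self.tiles_dir = tiles_dir
        self.pool = pool
        self.fdr = fdr
        self.logger = logger

    @classmethod
    def from_config(cls, config: SketchConfig) -> "SketchStore":
        # Composition-root path: resolve every dependency from config in one place.
        return cls(
            tiles_dir=config.tiles_dir,
            pool=_open_pool(config),
            fdr=ListFdrSink(),
            logger=logging.getLogger("c6_tile_cache"),
        )
```

A test can then call `SketchStore(tiles_dir=tmp_path, pool=test_pool, fdr=ListFdrSink(), logger=...)` directly, while the factory only ever calls `SketchStore.from_config(config)`.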
Files added / modified
New (production)
- src/gps_denied_onboard/components/c6_tile_cache/postgres_filesystem_store.py: MmapTilePixelHandle (read-only PROT_READ mmap returning a .toreadonly() memoryview); PostgresFilesystemStore with explicit dependency-injected __init__ and a from_config classmethod for the composition root. Implements read_tile_pixels, write_tile, tile_exists, delete_tile, query_by_bbox, insert_metadata, update_voting_status, mark_uploaded, pending_uploads, record_lru_access, lru_candidates, total_disk_bytes, get_by_id. All third-party exceptions (psycopg.Error, OSError, Sha256SidecarError) are rewrapped into the TileCacheError family. Construction runs an O(N) orphan-file reconciliation scan against the tiles_dir and emits an INFO c6.store.construct log with the steady-state row count and disk bytes.
- src/gps_denied_onboard/components/c6_tile_cache/tools.py: operator-side CLI (python -m gps_denied_onboard.components.c6_tile_cache.tools dump --zoom Z --lat LAT --lon LON [-o PATH]) that opens the production store via load_config() + PostgresFilesystemStore.from_config(), reads the tile via the mmap handle, and writes the JPEG body to stdout or the supplied file. Intentionally no formal contract: a thin shell over read_tile_pixels.
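A read-only mmap handle in the spirit of MmapTilePixelHandle could be sketched as follows. This is not the actual class: `ReadOnlyTileHandle` and `TileFsErrorSketch` are hypothetical names, and the sketch uses the portable `mmap.ACCESS_READ` flag rather than the Unix-only `prot=mmap.PROT_READ` argument.

```python
import mmap
from pathlib import Path


class TileFsErrorSketch(Exception):
    """Hypothetical stand-in for the store's filesystem error type."""


class ReadOnlyTileHandle:
    """Maps a tile file read-only and exposes it as a read-only memoryview."""

    def __init__(self, path: Path) -> None:
        try:
            with open(path, "rb") as fh:
                # Length 0 maps the whole file; an empty file raises ValueError.
                self._map = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        except (OSError, ValueError) as exc:
            # Missing and empty files both surface as the same wrapped error.
            raise TileFsErrorSketch(f"cannot map {path}: {exc}") from exc
        self._view = memoryview(self._map).toreadonly()

    @property
    def view(self) -> memoryview:
        return self._view

    def close(self) -> None:
        # The view must be released first, or closing the map raises BufferError.
        self._view.release()
        self._map.close()

    def __enter__(self) -> "ReadOnlyTileHandle":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()
```

Callers that need the bytes beyond the lifetime of the handle would copy them out with `bytes(handle.view)` before leaving the `with` block.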
Modified (production)
- src/gps_denied_onboard/components/c6_tile_cache/config.py: added postgres_pool_size: int = 4 to C6TileCacheConfig with > 0 validation per AZ-305 scope.
- src/gps_denied_onboard/fdr_client/records.py: added c6.write (tile_id, source, disk_bytes, content_sha256) and c6.write_failed (tile_id, source, reason, error_class, message) entries to KNOWN_PAYLOAD_KEYS. The parser is forward-compatible by design (unknown kinds parse opaquely), so v1.0 readers do not break, but the new entries put the new kinds on the validated / monitored hot path.
- src/gps_denied_onboard/runtime_root/storage_factory.py: build_tile_store and build_tile_metadata_store now dispatch via PostgresFilesystemStore.from_config(config) so the runtime root no longer needs to know about pool / FdrClient / logger wiring.
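The forward-compatible behaviour described for KNOWN_PAYLOAD_KEYS can be illustrated with a small sketch. The two key sets come from the list above; the function name `check_payload`, the `PayloadKeyError` type, and the exact-match rule (no missing and no extra keys) are assumptions for illustration and may differ from what records.py actually does.

```python
from typing import Mapping

# Key sets as listed in this report for the two new record kinds.
KNOWN_PAYLOAD_KEYS = {
    "c6.write": frozenset({"tile_id", "source", "disk_bytes", "content_sha256"}),
    "c6.write_failed": frozenset(
        {"tile_id", "source", "reason", "error_class", "message"}
    ),
}


class PayloadKeyError(ValueError):
    """Hypothetical error for a known kind whose payload keys do not match."""


def check_payload(kind: str, payload: Mapping[str, object]) -> bool:
    """Return True if the kind is known and valid, False if the kind is unknown.

    Unknown kinds are passed through untouched (opaque), which is what keeps
    older readers from breaking when new kinds appear.
    """
    expected = KNOWN_PAYLOAD_KEYS.get(kind)
    if expected is None:
        return False
    actual = frozenset(payload)
    if actual != expected:
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        raise PayloadKeyError(f"{kind}: missing={missing} extra={extra}")
    return True
```

Under this reading, registering a kind only tightens validation for that kind; it never changes how unregistered kinds are handled.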
Modified (tests)
- tests/unit/c6_tile_cache/test_postgres_filesystem_store.py: NEW suite of 25 tests:
  - 5 non-docker unit tests for MmapTilePixelHandle (read-only view, missing-file TileFsError, empty-file TileFsError), _quality_to_dict round-trip, and _row_to_metadata NULL-voting → TRUSTED normalisation.
  - 15 @pytest.mark.docker tests covering AC-1..AC-15 against a real Postgres + tmp_path filesystem.
  - 5 bonus tests covering insert_metadata validation, get_by_id absence, and per-flight separation via different flight_ids.
- tests/unit/c6_tile_cache/test_protocol_conformance.py: the AZ-303 fake PostgresFilesystemStore now exposes a from_config classmethod so the factory dispatch keeps working; the AC-5 "module missing" branch is now exercised by patching the lazy import site to raise ModuleNotFoundError.
- tests/unit/test_az272_fdr_record_schema.py: added fixture payloads for the new c6.write and c6.write_failed kinds so the per-kind round-trip test (AC-1 of AZ-272) covers them.
Modified (docs)
- _docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md: bumped to v1.1.0 (non-breaking, forward-compat); added rows for the two new kinds and a change-log entry.
Modified (build)
- pyproject.toml: added psycopg-pool>=3.2,<4.0 to dependencies (previously only psycopg[binary] was pinned; the impl needs the pool to amortise checkout latency on the F3 read path per Risk 3 of the AZ-305 spec).
Acceptance criteria coverage
| AC | Test | Status |
|---|---|---|
| AC-1 round-trip byte-identical | test_ac1_write_read_round_trip_byte_identical | passing |
| AC-2 hash mismatch rejected before I/O | test_ac2_content_hash_mismatch_rejects_before_io | passing |
| AC-3 duplicate key + compensating delete | test_ac3_duplicate_key_raises_metadata_error_with_compensating_delete | passing |
| AC-4 row without file fails fast | test_ac4_row_without_file_raises_metadata_error | passing |
| AC-5 bbox deterministic order | test_ac5_query_by_bbox_returns_deterministic_results | passing |
| AC-6 bbox filters | test_ac6_query_by_bbox_honours_filters | passing |
| AC-7 voting forward transitions | test_ac7_update_voting_status_enforces_forward_transitions | passing |
| AC-8 mark_uploaded + pending_uploads | test_ac8_mark_uploaded_removes_from_pending | passing |
| AC-9 LRU monotonic | test_ac9_record_lru_access_is_monotonic | passing |
| AC-10 disk bytes excludes rejected | test_ac10_total_disk_bytes_excludes_rejected | passing |
| AC-11 delete_tile idempotent | test_ac11_delete_tile_is_idempotent | passing |
| AC-12 third-party errors rewrapped | test_ac12_third_party_exceptions_rewrapped | passing |
| AC-13 warm read p95 budget | test_ac13_read_tile_pixels_warm_latency_p95 | passing |
| AC-14 5 Hz write burst | test_ac14_write_tile_sustains_burst_without_drops | passing |
| AC-15 FDR record on success/failure | test_ac15_fdr_record_on_write_success_and_failure | passing |
AC Test Coverage: 15 of 15 covered
Code Review Verdict: PASS
Auto-Fix Attempts: 1 (ruff --fix; 22 of 22 findings auto-resolved) + 1 user-requested fix (AC-3 strict-reading)
Stuck Agents: None
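For the AC-13 latency check, a p95 assertion against a hard failure threshold might be structured like the sketch below. The 5 ms and 0.5 ms figures are the ones quoted from the spec in this report; the helper names, the nearest-rank percentile method, and the warmup and iteration counts are assumptions, not the actual test code.

```python
import time
from typing import Callable, List, Sequence

HARD_P95_LIMIT_S = 0.005    # 5 ms failure threshold quoted from the spec
SOFT_P95_TARGET_S = 0.0005  # 0.5 ms target; reported, not asserted


def p95_nearest_rank(samples: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (no interpolation)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def measure_latencies(fn: Callable[[], object], *, warmup: int = 10,
                      iterations: int = 200) -> List[float]:
    """Call fn repeatedly and return per-call wall-clock durations in seconds."""
    for _ in range(warmup):
        fn()  # warm caches so the measured calls reflect the warm path
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def check_warm_read(fn: Callable[[], object]) -> float:
    """Assert only the hard limit; return p95 so callers can log it."""
    p95 = p95_nearest_rank(measure_latencies(fn))
    assert p95 < HARD_P95_LIMIT_S, f"p95 {p95:.6f}s exceeds hard limit"
    return p95
```

Asserting only `HARD_P95_LIMIT_S` keeps the check stable on shared CI hardware, while the returned value can still be compared against `SOFT_P95_TARGET_S` in a dashboard or log.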
Findings (self-review)
| # | Severity | Category | Location | Note | Resolution |
|---|---|---|---|---|---|
| 1 | Medium | Spec-Gap | postgres_filesystem_store.py::_write_tile_impl | AC-3's strictest reading required the original row + file to be byte-identical after a duplicate-key collision. Original impl wrote the sidecar BEFORE the row insert, so a duplicate fired the comp-delete on the freshly overwritten file. | FIXED in this batch (user chose fix_now): _write_tile_impl was reordered. INSERT now runs first inside an open transaction; only on success does the atomic sidecar write touch the canonical path; the commit then closes the transaction. Duplicate-key collisions now raise TileMetadataError BEFORE any byte hits disk, leaving the original file untouched. Comp-delete is retained for the (extremely rare) commit-after-write-failure path. AC-3 test asserts the strict invariant: original file bytes + sidecar are byte-identical, and read_tile_pixels still returns the original blob_a. |
| 2 | Low | Maintainability | postgres_filesystem_store.py::_emit_write_failed | The failure path calls self._tile_xy() to derive the canonical UUID for the FDR record. If _tile_xy() itself ever raises (it shouldn't, because TileId.__post_init__ validates lat/lon at construction), the FDR record would be lost and the exception would mask the original write-time error. Pre-validation in TileId keeps this safe today; revisit when WgsConverter gains a per-call failure mode. | Open (Low), accepted as-is. |
| 3 | Low | Test-quality | test_ac13_read_tile_pixels_warm_latency_p95 | The spec quotes a 0.5 ms p95 target with a 5 ms failure threshold. The test asserts only the failure threshold so it stays useful on a heterogeneous CI host; the soft 0.5 ms goal is tracked outside of this test (e.g., performance dashboards). | Open (Low), accepted as-is. |
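The reordering described in finding 1 (row insert first, file write second, commit last, compensating delete only if something fails after the write) can be sketched as below. To stay runnable without a database server, the sketch uses the standard-library sqlite3 module as a stand-in for Postgres; `write_tile`, `open_demo_db`, `DuplicateTileError`, the table layout, and the plain `write_bytes` call are all hypothetical simplifications of the real `_write_tile_impl`.

```python
import sqlite3
from pathlib import Path


class DuplicateTileError(Exception):
    """Hypothetical stand-in for the store's duplicate-key metadata error."""


def open_demo_db() -> sqlite3.Connection:
    # isolation_level=None disables implicit transactions so BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE tiles (tile_key TEXT PRIMARY KEY, disk_bytes INTEGER NOT NULL)"
    )
    return conn


def write_tile(conn: sqlite3.Connection, root: Path, tile_key: str, blob: bytes) -> None:
    path = root / f"{tile_key}.jpg"
    wrote_file = False
    try:
        conn.execute("BEGIN")
        try:
            # Step 1: claim the key. A duplicate fails here, before any byte is written.
            conn.execute(
                "INSERT INTO tiles (tile_key, disk_bytes) VALUES (?, ?)",
                (tile_key, len(blob)),
            )
        except sqlite3.IntegrityError as exc:
            raise DuplicateTileError(tile_key) from exc
        # Step 2: write the file (the real store does an atomic write plus sidecar).
        path.write_bytes(blob)
        wrote_file = True
        # Step 3: commit closes the transaction.
        conn.execute("COMMIT")
    except BaseException:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        if wrote_file:
            # Compensating delete: only reachable if the failure came after the write.
            path.unlink(missing_ok=True)
        raise
```

On a duplicate key, `wrote_file` is still `False` when the error propagates, so the existing file for that key is never touched; the compensating delete only runs for the narrow window between a successful write and a failed commit.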
Tracker
- AZ-305 transitioned to In Progress on session start; will be moved to In Testing post-commit per protocols.md.
Test suite
- tests/unit/c6_tile_cache/ (128 tests): passing at Tier-2.
- Full Tier-2 suite (pytest tests/unit): 1215 passed, 8 skipped, 1 pre-existing failure (test_ac8_read_host_tuple_on_jetson: needs pynvml, Jetson-only, unrelated to AZ-305; confirmed pre-existing on bf33b94 by git stash round-trip).
Next batch
All AZ-305 work is complete. Cycle 1 has no remaining batches in the greenfield queue, so autodev advances to the cycle-end gate (Step 7's batch-loop exit → Step 15 Product Implementation Completeness Gate, or the next sub-step the active flow defines).