mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 08:21:14 +00:00
[AZ-305] c6 PostgresFilesystemStore: TileStore + TileMetadataStore impl
Adds the production PostgresFilesystemStore implementing both protocols in a single class. Filesystem-backed JPEG I/O (atomic sidecar write, read-only mmap) + Postgres-backed metadata (spatial bbox, LRU, voting, upload bookkeeping). Wires composition via `from_config` classmethod.

Key behaviors:
- AC-3 strict reading: INSERT runs first inside an open transaction; duplicate-key collisions raise `TileMetadataError` BEFORE any byte is written, leaving the original file + sidecar byte-identical. Atomic sidecar write happens inside the same transaction; commit closes it. Comp-delete remains as a safety net for the rare commit-after-write failure path.
- AC-2 content-hash gate runs before any I/O.
- Construction performs an orphan-file reconciliation scan and emits an INFO `c6.store.construct` log with steady-state stats.

Adds `c6.write` and `c6.write_failed` FDR record kinds (schema v1.1.0, forward-compatible) and a thin operator CLI at `c6_tile_cache.tools dump` for inspection.

Dependencies: adds `psycopg-pool>=3.2,<4.0` for the connection pool used on the F3 read-hot path.

Tests: 25 new tests for c6_tile_cache cover AC-1..AC-15 plus MmapTilePixelHandle + helper round-trips. Full Tier-2 unit suite passes (1215 passed, 8 skipped, 1 pre-existing unrelated failure `test_ac8_read_host_tuple_on_jetson`: missing `pynvml` on macOS, Jetson-only).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,151 @@
# Batch 28 / Cycle 1: Implementation Report

**Date**: 2026-05-12
**Tasks**: AZ-305 (C6 PostgresFilesystemStore: TileStore + TileMetadataStore production impl)
**Story points landed**: 5
**Status**: complete (AZ-305 → In Testing)

## Scope summary

Single-task batch landing the production `PostgresFilesystemStore`, the
single class that satisfies BOTH `TileStore` (filesystem-backed JPEG I/O
byte-identical to `satellite-provider`) and `TileMetadataStore`
(Postgres-backed spatial / LRU / voting state). Owns the full insert
path (atomic-write + SHA-256 sidecar via AZ-280, content-hash gate,
single-transaction row insert, compensating delete on failure), the read
path (`MmapTilePixelHandle` read-only mmap, btree-indexed bbox query,
LRU access stamp), and bookkeeping (`mark_uploaded`,
`update_voting_status`, `lru_candidates`, `total_disk_bytes`). Wires the
freshness-gate call site (pass-through hook for AZ-307 to replace) and
exposes the LRU primitives AZ-308 will consume.

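The content-hash gate and the atomic write with a SHA-256 sidecar can be sketched roughly as follows. This is an illustration only, not the AZ-280 helper: the names `ContentHashMismatch`, `atomic_write`, and `write_with_sidecar`, and the `.sha256` sidecar suffix, are assumptions made for the example.

```python
import contextlib
import hashlib
import os
import tempfile
from pathlib import Path


class ContentHashMismatch(ValueError):
    """Hypothetical error: the declared digest does not match the payload."""


def atomic_write(path: Path, data: bytes) -> None:
    """Write to a temp file in the target directory, fsync, then rename into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Never leave a half-written temp file behind.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


def write_with_sidecar(path: Path, data: bytes, expected_sha256: str) -> None:
    """Gate on the content hash first, then write the body and its sidecar."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # The gate runs before any I/O, so a mismatch leaves the disk untouched.
        raise ContentHashMismatch(f"expected {expected_sha256}, got {actual}")
    atomic_write(path, data)
    # Assumed sidecar layout: "<name>.sha256" holding the hex digest.
    atomic_write(path.with_name(path.name + ".sha256"), actual.encode("ascii"))
```

Because the digest is checked before `atomic_write` is ever called, a rejected payload produces neither a body file nor a sidecar.
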
The class is invoked from `storage_factory` via a new `from_config`
classmethod that resolves the `psycopg_pool.ConnectionPool`, the
producer-local `FdrClient` (via `make_fdr_client`), and the project
logger. `__init__` itself takes explicit injected dependencies so unit
tests can substitute the `FakeFdrSink`, a `tmp_path` root, and a
test-managed pool without touching the composition root.

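The split between an injected `__init__` and a `from_config` classmethod might look roughly like the sketch below. `SketchStore`, `StoreConfig`, `FdrSink`, `RecordingSink`, `note_write`, and the `postgres_dsn` field are hypothetical names for illustration; only `postgres_pool_size`, `from_config`, and `psycopg_pool.ConnectionPool` come from the report.

```python
import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Protocol, Tuple


class FdrSink(Protocol):
    """Hypothetical minimal interface for the flight-data-recorder client."""

    def emit(self, kind: str, payload: Dict[str, Any]) -> None: ...


class RecordingSink:
    """In-memory sink a unit test can inspect (stand-in for a fake FDR sink)."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Dict[str, Any]]] = []

    def emit(self, kind: str, payload: Dict[str, Any]) -> None:
        self.records.append((kind, payload))


@dataclass(frozen=True)
class StoreConfig:
    """Hypothetical config; only postgres_pool_size is named in the report."""

    tiles_dir: Path
    postgres_dsn: str
    postgres_pool_size: int = 4


class SketchStore:
    def __init__(self, *, root: Path, pool: Any, fdr: FdrSink, logger: logging.Logger) -> None:
        # Explicit dependencies: nothing is resolved from globals here.
        self._root = root
        self._pool = pool
        self._fdr = fdr
        self._logger = logger

    @classmethod
    def from_config(cls, config: StoreConfig) -> "SketchStore":
        # Composition-root path. The import is local so tests that inject
        # their own pool never need psycopg_pool installed.
        from psycopg_pool import ConnectionPool

        pool = ConnectionPool(
            config.postgres_dsn,
            min_size=1,
            max_size=config.postgres_pool_size,
            open=True,
        )
        # The real code resolves the FDR client via make_fdr_client; a
        # recording sink stands in so the sketch stays self-contained.
        return cls(
            root=config.tiles_dir,
            pool=pool,
            fdr=RecordingSink(),
            logger=logging.getLogger("c6"),
        )

    def note_write(self, tile_id: str, disk_bytes: int) -> None:
        # Illustrative method showing the injected sink in use.
        self._fdr.emit("c6.write", {"tile_id": tile_id, "disk_bytes": disk_bytes})
```

A unit test would call `SketchStore(root=tmp_path, pool=fake_pool, fdr=RecordingSink(), logger=...)` directly and never go through `from_config`.
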
## Files added / modified

### New (production)

- `src/gps_denied_onboard/components/c6_tile_cache/postgres_filesystem_store.py`:
  `MmapTilePixelHandle` (read-only `PROT_READ` mmap returning a
  `.toreadonly()` `memoryview`); `PostgresFilesystemStore` with explicit
  dependency-injected `__init__` and a `from_config` classmethod for the
  composition root. Implements `read_tile_pixels`, `write_tile`,
  `tile_exists`, `delete_tile`, `query_by_bbox`, `insert_metadata`,
  `update_voting_status`, `mark_uploaded`, `pending_uploads`,
  `record_lru_access`, `lru_candidates`, `total_disk_bytes`,
  `get_by_id`. All third-party exceptions (`psycopg.Error`, `OSError`,
  `Sha256SidecarError`) are rewrapped into the `TileCacheError` family.
  Construction runs an O(N) orphan-file reconciliation scan against the
  `tiles_dir` and emits an INFO `c6.store.construct` log with the
  steady-state row count and disk bytes.
- `src/gps_denied_onboard/components/c6_tile_cache/tools.py`:
  operator-side CLI (`python -m gps_denied_onboard.components.c6_tile_cache.tools dump --zoom Z --lat LAT --lon LON [-o PATH]`)
  that opens the production store via `load_config()` +
  `PostgresFilesystemStore.from_config()`, reads the tile via the mmap
  handle, and writes the JPEG body to stdout or the supplied file.
  Intentionally no formal contract; it is a thin shell over `read_tile_pixels`.

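A read-only mmap handle of the kind described above could be sketched as follows. This is not the project's `MmapTilePixelHandle`: `ReadOnlyMmapHandle` and the local `TileFsError` class are stand-ins, and the sketch uses the portable `mmap.ACCESS_READ` flag where the report mentions `PROT_READ`.

```python
import mmap
from pathlib import Path


class TileFsError(Exception):
    """Stand-in for the report's TileFsError."""


class ReadOnlyMmapHandle:
    """Maps a file read-only and exposes it as a read-only memoryview."""

    def __init__(self, path: Path) -> None:
        try:
            with open(path, "rb") as fh:
                # Length 0 maps the whole file; an empty file raises ValueError.
                self._mmap = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        except (OSError, ValueError) as exc:
            # Missing and empty files both surface as the domain error.
            raise TileFsError(f"cannot map {path}: {exc}") from exc
        self._view = memoryview(self._mmap).toreadonly()

    @property
    def view(self) -> memoryview:
        return self._view

    def close(self) -> None:
        # The view must be released before the mapping can be closed.
        self._view.release()
        self._mmap.close()

    def __enter__(self) -> "ReadOnlyMmapHandle":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()
```

Callers that keep slices of `view` alive past `close()` would hit a `BufferError`, so a real handle needs a clear ownership rule for the view's lifetime.
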
### Modified (production)

- `src/gps_denied_onboard/components/c6_tile_cache/config.py`: added
  `postgres_pool_size: int = 4` to `C6TileCacheConfig` with `> 0`
  validation per AZ-305 scope.
- `src/gps_denied_onboard/fdr_client/records.py`: added
  `c6.write` (`tile_id, source, disk_bytes, content_sha256`) and
  `c6.write_failed` (`tile_id, source, reason, error_class, message`)
  entries to `KNOWN_PAYLOAD_KEYS`. The parser is forward-compatible
  by design (unknown kinds parse opaquely), so v1.0 readers do not
  break, but the new entries put the new kinds on the validated /
  monitored hot path.
- `src/gps_denied_onboard/runtime_root/storage_factory.py`:
  `build_tile_store` and `build_tile_metadata_store` now dispatch via
  `PostgresFilesystemStore.from_config(config)` so the runtime root no
  longer needs to know about pool / FdrClient / logger wiring.

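The forward-compatible behaviour described for `KNOWN_PAYLOAD_KEYS` can be illustrated with a small sketch. The two key sets are taken from the list above; the `validate_record` function and its exact-match rule are assumptions, and the real parser may be more lenient about extra keys.

```python
from typing import Any, Dict, FrozenSet, Mapping

# Key sets as listed in the report for the two new record kinds.
KNOWN_PAYLOAD_KEYS: Dict[str, FrozenSet[str]] = {
    "c6.write": frozenset({"tile_id", "source", "disk_bytes", "content_sha256"}),
    "c6.write_failed": frozenset({"tile_id", "source", "reason", "error_class", "message"}),
}


def validate_record(kind: str, payload: Mapping[str, Any]) -> bool:
    """Return True if the payload was validated, False if the kind is unknown.

    Unknown kinds are accepted opaquely so that older readers keep working
    when a newer producer introduces a record kind they have never seen.
    """
    expected = KNOWN_PAYLOAD_KEYS.get(kind)
    if expected is None:
        return False  # opaque pass-through, not an error
    actual = frozenset(payload)
    if actual != expected:
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        raise ValueError(f"{kind}: missing={missing} unexpected={extra}")
    return True
```

Under this scheme, registering a kind is what moves it from opaque pass-through to checked, which matches the report's point that the new entries put `c6.write` and `c6.write_failed` on the validated path.
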
### Modified (tests)

- `tests/unit/c6_tile_cache/test_postgres_filesystem_store.py`:
  **NEW** suite of 25 tests:
  - 5 non-docker unit tests for `MmapTilePixelHandle` (read-only view,
    missing-file `TileFsError`, empty-file `TileFsError`),
    `_quality_to_dict` round-trip, and `_row_to_metadata` NULL-voting →
    `TRUSTED` normalisation.
  - 15 `@pytest.mark.docker` tests covering AC-1..AC-15 against a
    real Postgres + `tmp_path` filesystem.
  - 5 bonus tests covering `insert_metadata` validation, `get_by_id`
    absence, and per-flight separation via different `flight_id`s.
- `tests/unit/c6_tile_cache/test_protocol_conformance.py`: the AZ-303
  fake `PostgresFilesystemStore` now exposes a `from_config` classmethod
  so the factory dispatch keeps working; the AC-5 "module missing"
  branch is now exercised by patching the lazy import site to raise
  `ModuleNotFoundError`.
- `tests/unit/test_az272_fdr_record_schema.py`: added fixture payloads
  for the new `c6.write` and `c6.write_failed` kinds so the per-kind
  round-trip test (AC-1 of AZ-272) covers them.

### Modified (docs)

- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md`:
  bumped to v1.1.0 (non-breaking, forward-compat); added rows for the
  two new kinds and a change-log entry.

### Modified (build)

- `pyproject.toml`: added `psycopg-pool>=3.2,<4.0` to dependencies
  (previously only `psycopg[binary]` was pinned; the impl needs the
  pool to amortise checkout latency on the F3 read path per Risk 3 of
  the AZ-305 spec).

## Acceptance criteria coverage

| AC | Test | Status |
|----|------|--------|
| AC-1 round-trip byte-identical | `test_ac1_write_read_round_trip_byte_identical` | passing |
| AC-2 hash mismatch rejected before I/O | `test_ac2_content_hash_mismatch_rejects_before_io` | passing |
| AC-3 duplicate key + compensating delete | `test_ac3_duplicate_key_raises_metadata_error_with_compensating_delete` | passing |
| AC-4 row without file fails fast | `test_ac4_row_without_file_raises_metadata_error` | passing |
| AC-5 bbox deterministic order | `test_ac5_query_by_bbox_returns_deterministic_results` | passing |
| AC-6 bbox filters | `test_ac6_query_by_bbox_honours_filters` | passing |
| AC-7 voting forward transitions | `test_ac7_update_voting_status_enforces_forward_transitions` | passing |
| AC-8 mark_uploaded + pending_uploads | `test_ac8_mark_uploaded_removes_from_pending` | passing |
| AC-9 LRU monotonic | `test_ac9_record_lru_access_is_monotonic` | passing |
| AC-10 disk bytes excludes rejected | `test_ac10_total_disk_bytes_excludes_rejected` | passing |
| AC-11 delete_tile idempotent | `test_ac11_delete_tile_is_idempotent` | passing |
| AC-12 third-party errors rewrapped | `test_ac12_third_party_exceptions_rewrapped` | passing |
| AC-13 warm read p95 budget | `test_ac13_read_tile_pixels_warm_latency_p95` | passing |
| AC-14 5 Hz write burst | `test_ac14_write_tile_sustains_burst_without_drops` | passing |
| AC-15 FDR record on success/failure | `test_ac15_fdr_record_on_write_success_and_failure` | passing |

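For AC-13, the self-review findings in this report note a 0.5 ms p95 target with a 5 ms failure threshold, of which the test asserts only the latter. A p95 check of that shape might be sketched as below; the helper names, the warm-up and run counts, and the nearest-rank percentile definition are all assumptions, not the project's test code.

```python
import time
from typing import Callable, List

SOFT_TARGET_MS = 0.5  # spec goal; per the report, tracked outside the test
HARD_LIMIT_MS = 5.0   # failure threshold the test asserts


def p95_ms(samples_ms: List[float]) -> float:
    """Nearest-rank 95th percentile of a list of latencies in milliseconds."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed in integer arithmetic to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def time_calls_ms(fn: Callable[[], object], warmup: int = 50, runs: int = 500) -> List[float]:
    """Warm the path up, then record one latency sample per call."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def assert_within_hard_limit(fn: Callable[[], object]) -> float:
    """Fail only on the hard limit so the check stays stable on shared CI hosts."""
    observed = p95_ms(time_calls_ms(fn))
    assert observed < HARD_LIMIT_MS, f"p95 {observed:.3f} ms exceeds {HARD_LIMIT_MS} ms"
    return observed
```

Returning the observed value lets a caller log it against `SOFT_TARGET_MS` without turning the soft goal into a flaky assertion.
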
## AC Test Coverage: 15 of 15 covered
## Code Review Verdict: PASS
## Auto-Fix Attempts: 1 (ruff `--fix`; 22 of 22 findings auto-resolved) + 1 user-requested fix (AC-3 strict-reading)
## Stuck Agents: None

## Findings (self-review)

| # | Severity | Category | Location | Note | Resolution |
|---|----------|----------|----------|------|------------|
| 1 | Medium | Spec-Gap | `postgres_filesystem_store.py::_write_tile_impl` | AC-3's strictest reading required the original row + file to be byte-identical after a duplicate-key collision. Original impl wrote the sidecar BEFORE the row insert, so a duplicate fired the comp-delete on the freshly overwritten file. | **FIXED** in this batch (user chose `fix_now`): `_write_tile_impl` was reordered so that INSERT now runs first inside an open transaction; only on success does the atomic sidecar write touch the canonical path; the commit then closes the transaction. Duplicate-key collisions now raise `TileMetadataError` BEFORE any byte hits disk, leaving the original file untouched. Comp-delete is retained for the (extremely rare) commit-after-write-failure path. AC-3 test asserts the strict invariant: original file bytes + sidecar are byte-identical, and `read_tile_pixels` still returns the original `blob_a`. |
| 2 | Low | Maintainability | `postgres_filesystem_store.py::_emit_write_failed` | The failure path calls `self._tile_xy()` to derive the canonical UUID for the FDR record. If `_tile_xy()` itself ever raises (it shouldn't, since `TileId.__post_init__` validates lat/lon at construction), the FDR record would be lost and the exception would mask the original write-time error. Pre-validation in `TileId` keeps this safe today; revisit when `WgsConverter` gains a per-call failure mode. | Open (Low); accepted as-is. |
| 3 | Low | Test-quality | `test_ac13_read_tile_pixels_warm_latency_p95` | The spec quotes a 0.5 ms p95 target with a 5 ms failure threshold. The test asserts only the failure threshold so it stays useful on a heterogeneous CI host; the soft 0.5 ms goal is tracked outside of this test (e.g., performance dashboards). | Open (Low); accepted as-is. |

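The insert-first ordering from finding 1 can be sketched as follows. This is not `_write_tile_impl`: it uses the standard-library `sqlite3` module as a stand-in for Postgres so the example runs anywhere, and the `tiles` table, `write_tile` function, and local `TileMetadataError` class are hypothetical. Sidecar handling and fsync are omitted for brevity.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


class TileMetadataError(Exception):
    """Stand-in for the report's TileMetadataError."""


def write_tile(conn: sqlite3.Connection, path: Path, tile_key: str, blob: bytes) -> None:
    wrote_file = False
    try:
        # 1. INSERT first, inside an open transaction. A duplicate key
        #    fails here, before any byte reaches the canonical path.
        conn.execute(
            "INSERT INTO tiles (tile_key, disk_bytes) VALUES (?, ?)",
            (tile_key, len(blob)),
        )
        # 2. Only after the row is accepted, write via temp file + rename.
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        with os.fdopen(fd, "wb") as fh:
            fh.write(blob)
        os.replace(tmp_name, path)
        wrote_file = True
        # 3. The commit closes the transaction.
        conn.commit()
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        raise TileMetadataError(f"duplicate tile {tile_key}") from exc
    except Exception:
        conn.rollback()
        if wrote_file:
            # Compensating delete: the row was new, so the file is ours to remove.
            path.unlink(missing_ok=True)
        raise
```

With this ordering, a second write for the same key fails at step 1 and the existing file is never opened, which is the strict AC-3 invariant the finding describes.
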
## Tracker

- AZ-305 transitioned to **In Progress** on session start; will be moved to **In Testing** post-commit per `protocols.md`.

## Test suite

- `tests/unit/c6_tile_cache/` (128 tests): passing at Tier-2.
- Full Tier-2 suite (`pytest tests/unit`): 1215 passed, 8 skipped, 1 pre-existing failure (`test_ac8_read_host_tuple_on_jetson`; needs `pynvml`, Jetson-only, unrelated to AZ-305; confirmed pre-existing on `bf33b94` by `git stash` round-trip).

## Next batch

All AZ-305 work is complete. Cycle 1 has no remaining batches in the
greenfield queue, so autodev advances to the cycle-end gate (Step 7's
batch-loop exit → Step 15 Product Implementation Completeness Gate, or
the next sub-step the active flow defines).