# tile-cache-builder (AZ-407)

Builds the `tile-cache-fixture` Docker volume from the 60 still-image satellite references in `_docs/00_problem/input_data/` plus the Derkachi route bbox.

## Output schema

```
tile-cache-fixture/
  tiles///.jpg          # tile JPEG body
  tiles///.json         # per-tile sidecar (mirrors `tiles` row)
  manifest.csv          # sorted manifest (9 columns)
  descriptors.index     # FAISS HNSW32 index (omitted if faiss not available)
```

Manifest columns (per `_docs/00_problem/restrictions.md` § Satellite Imagery + `_docs/02_document/data_model.md` § 2.1):

| Column | Type | Notes |
|--------|------|-------|
| `zoom_level` | int | Slippy/XYZ zoom |
| `tile_x`, `tile_y` | int | Tile coords at the zoom |
| `capture_date` | ISO-8601 date | Default `2025-11-01` (frozen so freshness gate treats as fresh) |
| `source` | enum | `googlemaps` for real paired tiles, `stub` for D-PROJ-3 fallback |
| `m_per_px` | float | `0.5` (≥ the AC-8.1 floor) |
| `jpeg_path` | str | Relative path to the JPEG body |
| `content_hash` | hex | SHA-256 of the JPEG bytes |
| `provenance` | str | `paired_gmaps:AD000NNN`, `STUB`, or `STUB_BBOX:derkachi:lat,lon,lat,lon` |

## Reproducibility (AC-1)

Two consecutive invocations from the same input produce a bit-identical output tree:

* Input files iterated in lexicographic order
* PIL JPEG encoded with `quality=85, optimize=False, progressive=False, subsampling=2`
* Manifest rows sorted by `(zoom_level, tile_x, tile_y)` before CSV serialisation
* FAISS index built single-threaded with `omp_set_num_threads(1)` and SHA-derived stub descriptors

## Provenance (AC-7)

| Item | Source | License |
|------|--------|---------|
| Real tile bodies | `_docs/00_problem/input_data/AD*_gmaps.png` (2 paired references) | Project test fixture; safe to redistribute under this repo's license |
| Stub tile bodies | Generated from `_stub_jpeg_bytes(seed)` (PIL solid-fill) | Fully synthetic; no third-party data |
| Derkachi bbox tile | Synthetic placeholder until D-PROJ-3 lands | Fully synthetic |
| FAISS index | SHA-derived stub vectors (not real VPR descriptors) | Fully synthetic |

## Usage

```bash
# Production (Docker volume):
e2e/fixtures/tile-cache-builder/build.sh

# Local mode (used by AZ-407 unit test):
e2e/fixtures/tile-cache-builder/build.sh --local /tmp/tile-cache-out
```

The unit test `e2e/_unit_tests/fixtures/test_tile_cache_builder.py` verifies AC-1 / AC-2 / AC-7 by invoking `builder.py` twice against a `tmp_path` and asserting the output is byte-identical.

## Notes on D-PROJ-3

When D-PROJ-3 supplies the production tile-corpus for the Derkachi sector, the stub tiles produced here (any row with `provenance = STUB`) should be replaced by real Suite Sat Service tiles for those footprints. The builder will then no longer fall back to `_stub_jpeg_bytes`; every still that lacks a paired `_gmaps.png` will draw from the real corpus instead.

## Owned by

AZ-407 (this task). The FAISS-stub descriptor format will not be used in production; the production VPR pipeline (C2) emits real DINOv2 descriptors. The stub format is sufficient for AZ-407's reproducibility and schema contracts only.
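
## Example: deterministic manifest (sketch)

The manifest rules from the Output schema and Reproducibility sections (SHA-256 `content_hash`, rows sorted by `(zoom_level, tile_x, tile_y)` before CSV serialisation) can be illustrated with a minimal standard-library sketch. This is not the code in `builder.py`. The helper names `manifest_row` and `serialise_manifest`, the fixed `\n` line terminator, and the example file names are assumptions made for illustration only.

```python
import csv
import hashlib
import io

# Column order follows the manifest table in this README.
COLUMNS = [
    "zoom_level", "tile_x", "tile_y", "capture_date", "source",
    "m_per_px", "jpeg_path", "content_hash", "provenance",
]


def manifest_row(zoom, x, y, jpeg_bytes, jpeg_path,
                 source="stub", provenance="STUB"):
    """Build one manifest row (hypothetical helper, not the builder's API)."""
    return {
        "zoom_level": zoom,
        "tile_x": x,
        "tile_y": y,
        "capture_date": "2025-11-01",  # frozen default from the schema table
        "source": source,
        "m_per_px": 0.5,
        "jpeg_path": jpeg_path,
        # content_hash is the SHA-256 of the JPEG bytes, hex-encoded.
        "content_hash": hashlib.sha256(jpeg_bytes).hexdigest(),
        "provenance": provenance,
    }


def serialise_manifest(rows):
    """Sort rows by (zoom_level, tile_x, tile_y) and return CSV bytes."""
    ordered = sorted(
        rows, key=lambda r: (r["zoom_level"], r["tile_x"], r["tile_y"])
    )
    buf = io.StringIO(newline="")
    # A fixed line terminator is assumed so output does not vary by platform.
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(ordered)
    return buf.getvalue().encode("utf-8")


if __name__ == "__main__":
    # File names below are placeholders, not the real tile layout.
    rows = [
        manifest_row(18, 5, 7, b"tile-b", "example-b.jpg"),
        manifest_row(17, 9, 1, b"tile-a", "example-a.jpg"),
    ]
    first = serialise_manifest(rows)
    second = serialise_manifest(list(reversed(rows)))
    print("byte-identical:", first == second)
```

Because the rows are sorted just before serialisation, the order in which tiles were discovered does not affect the resulting bytes, which is the property the AC-1 unit test relies on.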
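
## Example: SHA-derived stub descriptor (sketch)

This README says the FAISS index is filled with SHA-derived stub vectors but does not describe how they are derived. The sketch below shows one possible deterministic scheme using only the standard library. The function name `stub_descriptor`, the default dimension of 8, the counter-based hashing, and the mapping to `[-1, 1]` are all assumptions for illustration, not the format `builder.py` actually emits.

```python
import hashlib
import struct


def stub_descriptor(content_hash: str, dim: int = 8) -> list[float]:
    """Derive a deterministic pseudo-descriptor from a hex content hash.

    Hypothetical scheme: hash "<content_hash>:<counter>" repeatedly and
    turn each 32-byte digest into eight floats until `dim` values exist.
    """
    values: list[float] = []
    counter = 0
    while len(values) < dim:
        seed = f"{content_hash}:{counter}".encode("ascii")
        digest = hashlib.sha256(seed).digest()
        # Read the digest as eight big-endian unsigned 32-bit integers.
        for word in struct.unpack(">8I", digest):
            # Scale each integer from [0, 2**32 - 1] to [-1.0, 1.0].
            values.append(word / 0xFFFFFFFF * 2.0 - 1.0)
        counter += 1
    return values[:dim]


if __name__ == "__main__":
    h = hashlib.sha256(b"example tile bytes").hexdigest()
    vec = stub_descriptor(h, dim=12)
    print(len(vec), vec == stub_descriptor(h, dim=12))
```

Since the vector depends only on the tile's `content_hash`, the same input tiles yield the same vectors on every run, which is what a bit-identical index build needs in addition to single-threaded construction.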