[AZ-306] C6 FaissDescriptorIndex (faiss-cpu, HNSW32)

Production-default DescriptorIndex strategy backed by the faiss-cpu
PyPI wheel (>=1.7,<2.0). Implements the AZ-303 Protocol surface end
to end: HNSW32 + IndexIDMap2 search, atomic three-file rebuild
(.index + .sha256 sidecar + .meta.json), triple-consistency load
check, mmap-backed reads with IO_FLAG_MMAP|IO_FLAG_READ_ONLY, optional
warm-up query at construction, FAISS RuntimeError rewrap to
IndexUnavailableError / IndexBuildError, and FaissDescriptorIndex.from_config
classmethod wired into runtime_root.storage_factory.

The original spec required a custom pybind11 wrapper over a vendored
FAISS HEAD; the user opted for the upstream faiss-cpu wheel after
research fact #92 confirmed ARM64 wheel availability for Jetson and
the existing pyproject.toml already pinned faiss-cpu. cpp/faiss_index/
placeholder removed; BUILD_FAISS_INDEX flag retained as a
runtime/factory gate (no native target). Spec rewritten end-to-end and
archived to _docs/02_tasks/done/.

C6TileCacheConfig extended with faiss_index_path and
faiss_warmup_query_path fields. tests/conftest.py sets
KMP_DUPLICATE_LIB_OK=TRUE to remediate the macOS faiss/torch libomp
duplicate-load abort during pytest (no-op on CI Linux). 21 new tests
cover AC-1..12 + 2 NFRs + from_config smoke; AZ-303 protocol-conformance
fake updated with from_config classmethod.

Tests: 124/124 c6_tile_cache pass; 1334 project-wide pass; 1
pre-existing OKVIS2 submodule failure unrelated.

Doc sync: module-layout.md, components/08_c6_tile_cache/description.md
§5, batch_35_cycle1_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 04:01:37 +03:00
parent ecf76d762d
commit 3b7265757b
17 changed files with 1550 additions and 87 deletions
@@ -119,7 +119,7 @@ Not applicable — internal-only; C11 `TileUploader` reads via `TileStore` for u
|---------|---------|---------|
| PostgreSQL (server + libpq) | 16.x (mirror of `satellite-provider`'s pin) | Spatial metadata index |
| psycopg / asyncpg | per project pin | Python Postgres client |
| FAISS (Python + C++) | upstream HEAD pinned per Plan-phase | HNSW retrieval |
| faiss-cpu (PyPI wheel) | `>=1.7,<2.0` | HNSW retrieval (AZ-306 chose the upstream wheel over a custom pybind11 wrapper; runtime-gated by `BUILD_FAISS_INDEX` at the storage factory) |
| atomicwrites | latest | Atomic file replacement for `.index` rebuild (D-C10-3) |
| hashlib (stdlib) | stdlib | SHA-256 content-hash sidecars |
+3 -3
View File
@@ -146,7 +146,7 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- Config block: `C6TileCacheConfig` (registered on import)
- Error family rooted at `TileCacheError` with documented subtypes (`TileNotFoundError`, `TileFsError`, `TileMetadataError`, `ContentHashMismatchError`, `FreshnessRejectionError`, `CacheBudgetExhaustedError`, `IndexUnavailableError`) + sibling `IndexBuildError` (offline build envelope, not in the `TileCacheError` family)
- **Internal**:
- `_native/` (FAISS C++ wrapper, planned)
- `faiss_descriptor_index.py` (AZ-306 — production-default `DescriptorIndex` strategy backed by the `faiss-cpu` PyPI wheel; HNSW32 search + atomic rebuild + triple-sidecar coherence + warm-up; `from_config` classmethod consumed by `runtime_root.storage_factory.build_descriptor_index`. Gated by `BUILD_FAISS_INDEX` at the factory boundary, NOT at module import.)
- `_tile_pixel_handle.py` (`TilePixelHandle` ABC)
- `_types.py` (DTOs / enums; consumed via the Public API re-exports)
- `_uuid_namespace.py` (AZ-304 — pinned `TILE_NAMESPACE_UUID` + `derive_tile_id` / `derive_location_hash` helpers; cross-repo coordinated with `satellite-provider`)
@@ -156,7 +156,7 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- `cache_budget_enforcer.py` (AZ-308 — RESTRICT-SAT-2 10 GiB hard cap; `CacheBudgetEnforcer.reserve_headroom` + `BudgetEnforcedTileStore` write-decorator)
- `tools.py` (AZ-305 — operator dump CLI invoked via `python -m gps_denied_onboard.components.c6_tile_cache.tools ...`)
- `errors.py`, `config.py` (component plumbing)
- **Owns**: `src/gps_denied_onboard/components/c6_tile_cache/**`, `cpp/faiss_index/**`, `tests/unit/c6_tile_cache/**`, `db/migrations/**` (project-level Alembic env owned by c6 — `alembic.ini` at repo root points here; `0001_initial.py` shipped by AZ-263 bootstrap, `0002_c6_tile_identity_and_lru.py` and forward owned by AZ-304+ migrations)
- **Owns**: `src/gps_denied_onboard/components/c6_tile_cache/**`, `tests/unit/c6_tile_cache/**`, `db/migrations/**` (project-level Alembic env owned by c6 — `alembic.ini` at repo root points here; `0001_initial.py` shipped by AZ-263 bootstrap, `0002_c6_tile_identity_and_lru.py` and forward owned by AZ-304+ migrations). AZ-306 retired the `cpp/faiss_index/` placeholder in favour of the `faiss-cpu` PyPI wheel; the `BUILD_FAISS_INDEX` flag is preserved as a runtime/factory gate (consumed by `runtime_root.storage_factory`).
- **Imports from**: `_types`, `helpers.sha256_sidecar`, `helpers.wgs_converter`, `clock`, `config`, `logging`, `fdr_client`
- **Consumed by**: `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c10_provisioning`, `c11_tile_manager`, `runtime_root`
@@ -427,7 +427,7 @@ Four binaries are built from this codebase: **airborne** (Tier-1 + Tier-2 produc
| `BUILD_C11_TILE_MANAGER` | c11_tile_manager | OFF | OFF | ON | OFF |
| `BUILD_C12_OPERATOR_TOOLING` | c12_operator_tooling | OFF | OFF | ON | OFF |
| `BUILD_GTSAM_BINDINGS` | cpp/gtsam_bindings (used by c4_pose + c5_state) | ON | ON | OFF | ON |
| `BUILD_FAISS_INDEX` | cpp/faiss_index (used by c6_tile_cache) | ON | ON | ON | OFF (replay reads pre-built cache only) |
| `BUILD_FAISS_INDEX` | c6_tile_cache `FaissDescriptorIndex` (faiss-cpu wheel; runtime gate at `runtime_root.storage_factory` — no native target) | ON | ON | ON | OFF (replay reads pre-built cache only) |
| `BUILD_VIDEO_FILE_FRAME_SOURCE` | `frame_source/VideoFileFrameSource` (AZ-265) | OFF | OFF | OFF | ON |
| `BUILD_TLOG_REPLAY_ADAPTER` | `c8_fc_adapter/tlog_replay_adapter` (AZ-265) | OFF | OFF | OFF | ON |
| `BUILD_REPLAY_SINK_JSONL` | `c8_fc_adapter/replay_sink` (AZ-265) | OFF | OFF | OFF | ON |