[AZ-322] C10 DescriptorBatcher (faiss-cpu, OOM halve-retry)

Implements the C10 internal phase that walks every C6 tile, embeds
through C2's backbone via the AZ-321-produced engine, and rebuilds
the AZ-306 FAISS HNSW index in one atomic write.

- DescriptorBatcher with halve-and-retry OOM recovery (default 1 retry)
- BackboneEmbedder Protocol + C7EngineBackboneEmbedder default impl
- DescriptorBatchError for OOM / dim-mismatch / missing-output failures
- Empty-corpus surfaces as outcome=failure with explicit hint to run C11
- Per-10% progress callback + DEBUG logs (no engine bytes leaked)
- Consumer-side Protocol cuts (TilesByBboxBatchQuery, TilePixelOpener,
  DescriptorIndexRebuilder) so c10 stays within AZ-270 lint
- runtime_root.c10_factory adds build_descriptor_batcher + three
  C6->C10 adapters
- 16 unit tests covering AC-1..AC-10 + 2 NFRs + 4 supplemental
  (Protocol conformance, query pass-through, handle release, config)

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 04:20:47 +03:00
parent 3b7265757b
commit f01a5058ab
12 changed files with 1733 additions and 10 deletions
@@ -100,6 +100,24 @@ C10 reads `tiles` rows from C6 (scoped to the build's bbox + zoom_levels), write
| atomicwrites | latest | Atomic file replacement for `.index` + Manifest (D-C10-3) |
| hashlib (stdlib) | stdlib | SHA-256 content-hash sidecars |
| PyYAML / orjson | per project pin | Manifest serialization |
| numpy | per project pin | Descriptor batch ndarray container (AZ-322 `DescriptorBatcher`) |
**AZ-322 internal phase — `DescriptorBatcher`**:
The `populate_descriptors` phase walks every tile in C6 for the requested
`(bbox, zoom_levels, sector_class)`, embeds them through C7's `InferenceRuntime`
(via `C7EngineBackboneEmbedder`, the default `BackboneEmbedder` impl), and
hands the resulting `(N, descriptor_dim)` ndarray to AZ-306's
`DescriptorIndex.rebuild_from_descriptors` for atomic FAISS index write.
CUDA OOM is handled via halve-and-retry bounded by `C10BatcherConfig.max_oom_retries`
(default 1: 64 → 32, then succeed-or-fail-fast) so a real GPU regression
surfaces in seconds rather than via silent retries. Per-10% progress is
emitted both as DEBUG logs (`c10.descriptor.progress`) and via an optional
`progress_callback` so operator tooling can wire a TTY/GUI bar without
touching the batcher itself. The descriptor int64 id formula is the
canonical AZ-306 scheme (`int.from_bytes(sha256("zoom|lat|lon").first8, "big", signed=True)`)
— invented locally to avoid a circular dependency back into C6 internals
would break AC-6.
**Error Handling Strategy**: