Oleksandr Bezdieniezhnykh f01a5058ab [AZ-322] C10 DescriptorBatcher (faiss-cpu, OOM halve-retry)
Implements the C10 internal phase that walks every C6 tile, embeds
through C2's backbone via the AZ-321-produced engine, and rebuilds
the AZ-306 FAISS HNSW index in one atomic write.

- DescriptorBatcher with halve-and-retry OOM recovery (default 1 retry)
- BackboneEmbedder Protocol + C7EngineBackboneEmbedder default impl
- DescriptorBatchError for OOM / dim-mismatch / missing-output failures
- Empty-corpus surfaces as outcome=failure with explicit hint to run C11
- Per-10% progress callback + DEBUG logs (no engine bytes leaked)
- Consumer-side Protocol cuts (TilesByBboxBatchQuery, TilePixelOpener,
  DescriptorIndexRebuilder) so c10 stays within AZ-270 lint
- runtime_root.c10_factory adds build_descriptor_batcher + three
  C6->C10 adapters
- 16 unit tests covering AC-1..AC-10 + 2 NFRs + 4 supplemental
  (Protocol conformance, query pass-through, handle release, config)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 04:20:47 +03:00


Batch 36 — Cycle 1 Report

Date: 2026-05-13
Batch: 36 (single task, direct AZ-306 follow-up)
Tasks: AZ-322 (C10 Descriptor Batcher, 3pt)
Status: complete; AZ-322 transitioned to "In Testing" pending operator review.

Scope

AZ-322 implements DescriptorBatcher, the C10 phase that walks every C6 tile matching the requested (bbox, zoom_levels, sector_class) filter, embeds each tile through C2's VPR backbone (via the C7 engine produced by AZ-321), and rebuilds the AZ-306 FAISS HNSW index in one atomic write.

This unblocks the airborne C2 VPR step's takeoff verify (AC-NEW-1) and makes the C10-PT-01 cold-build budget observable end-to-end.

Architectural Decisions

1. Consumer-side Protocol cuts (AZ-270 / AZ-507 compliance)

The AZ-322 task spec listed direct C6 types (TileMetadataStore, TileStore, DescriptorIndex) in the DescriptorBatcher.__init__ signature. That contradicts AZ-270 (no cross-component imports inside components/*) and the AZ-507 cross-component contract surface rule. The established precedent — AZ-323's ManifestBuilder and AZ-324's ManifestVerifierImpl — declares consumer-side structural Protocol cuts locally inside the C10 module and lets the composition root (runtime_root.c10_factory) wire C6's concrete strategies in via thin adapters.

This batch follows that precedent. descriptor_batcher.py declares four local-to-C10 Protocols:

  • BackboneEmbedder (lifted to interface.py for re-use by future tasks)
  • TilesByBboxBatchQuery — narrower than C6's TileMetadataStore.query_by_bbox, accepts tuple[int, ...] of zooms instead of a single zoom
  • TilePixelOpener — narrower than C6's TileStore.read_tile_pixels(TileId); takes (zoom, lat, lon) and returns a context manager
  • DescriptorIndexRebuilder — narrower than C6's DescriptorIndex.rebuild_from_descriptors(descriptors, tile_ids: list[TileId], hnsw_params: HnswParams); takes tile_records: list[TileBboxRecord] plus individual HNSW kwargs

The matching adapters live in runtime_root/c10_factory.py:

  • c6_tile_metadata_store_to_tiles_batch_query: loops over zoom_levels, projects TileMetadata rows down to the four-field TileBboxRecord
  • c6_tile_store_to_pixel_opener: builds TileId and returns the C6 TilePixelHandle (already a context manager)
  • c6_descriptor_index_to_rebuilder: projects TileBboxRecord → TileId and folds HNSW kwargs into HnswParams
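
The cut-plus-adapter pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in the repository: only the names TilePixelOpener, TileId, read_tile_pixels and c6_tile_store_to_pixel_opener come from this report. The method name open_tile, the FakeTileId and FakeC6TileStore stand-ins, and every signature are hypothetical.

```python
from __future__ import annotations

from contextlib import AbstractContextManager, contextmanager
from dataclasses import dataclass
from typing import Any, Iterator, Protocol, runtime_checkable


@runtime_checkable
class TilePixelOpener(Protocol):
    """Consumer-side cut declared inside C10 (method name is hypothetical)."""

    def open_tile(self, zoom: int, lat: float, lon: float) -> AbstractContextManager[Any]: ...


# Stand-ins for C6 types; the real ones live in another component.
@dataclass(frozen=True)
class FakeTileId:
    zoom: int
    lat: float
    lon: float


class FakeC6TileStore:
    def __init__(self) -> None:
        self.released: list[FakeTileId] = []

    @contextmanager
    def read_tile_pixels(self, tile_id: FakeTileId) -> Iterator[bytes]:
        try:
            yield b"jpeg-bytes"
        finally:
            # Record the release so callers can verify the handle was closed.
            self.released.append(tile_id)


class _PixelOpenerAdapter:
    """Belongs in the composition root, the one place that may see both sides."""

    def __init__(self, store: FakeC6TileStore) -> None:
        self._store = store

    def open_tile(self, zoom: int, lat: float, lon: float) -> AbstractContextManager[Any]:
        # Build the C6-side id and hand back C6's own context manager.
        return self._store.read_tile_pixels(FakeTileId(zoom, lat, lon))


def c6_tile_store_to_pixel_opener(store: FakeC6TileStore) -> TilePixelOpener:
    return _PixelOpenerAdapter(store)
```

The C10 module only ever names the Protocol; the adapter is the sole place where a C6 type and a C10 type appear together, which is what keeps the import graph inside the lint's rules.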

2. C7EngineBackboneEmbedder adapter — Any-typed at the c7 boundary

The default BackboneEmbedder impl wraps an AZ-297 InferenceRuntime + an AZ-321-compiled EngineHandle. Importing those types — even under TYPE_CHECKING — fails the AZ-270 AST lint because the lint walks ast.ImportFrom nodes regardless of context. We therefore type the constructor parameters as Any and rely on structural duck-typing (inference_runtime.infer(handle, dict) -> dict). The composition root wires the concrete C7 runtime in.
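
To see why a TYPE_CHECKING guard does not help, consider a lint built on ast.walk: it visits every ImportFrom node in the tree, including those nested inside an if block. The sketch below is not the AZ-270 lint itself; the function name find_cross_component_imports and the module path gps_denied_onboard.components.c7_inference are hypothetical and chosen only for illustration.

```python
import ast

# Source under inspection: the cross-component import is guarded, but still present.
SOURCE = '''
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    from gps_denied_onboard.components.c7_inference import InferenceRuntime
'''


def find_cross_component_imports(source: str, forbidden_prefix: str) -> list:
    """Return the module of every `from X import ...` whose X starts with the prefix."""
    tree = ast.parse(source)
    return [
        node.module
        for node in ast.walk(tree)  # walks all descendants, guarded or not
        if isinstance(node, ast.ImportFrom)
        and node.module is not None
        and node.module.startswith(forbidden_prefix)
    ]


violations = find_cross_component_imports(SOURCE, "gps_denied_onboard.components.c7")
print(violations)
```

A lint of this shape has no notion of the enclosing if, so the guarded import is reported just like an unguarded one.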

3. JPEG → tensor preprocessing is injected, not owned

C7EngineBackboneEmbedder accepts a tile_decoder: Callable[[Any], np.ndarray] rather than hard-wiring OpenCV / Pillow / torchvision. Image preprocessing belongs to E-C2 (AZ-255); when it ships, the composition root injects a real decoder. Until then the adapter stays free of imaging-stack dependencies, keeping AZ-322's surface narrow and the test surface tiny.
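
A rough sketch of the injected-decoder shape, assuming a duck-typed runtime with infer(handle, dict) -> dict as described above. SketchEmbedder, FakeRuntime, bytes_decoder and the tensor names "input" and "descriptors" are all hypothetical. Plain Python lists stand in for np.ndarray so the sketch runs on the standard library alone, and built-in exceptions stand in for DescriptorBatchError.

```python
from __future__ import annotations

from typing import Any, Callable, Sequence


class SketchEmbedder:
    """Illustrative stand-in for the embedder adapter, not the shipped class."""

    def __init__(
        self,
        inference_runtime: Any,  # duck-typed: must expose infer(handle, dict) -> dict
        engine_handle: Any,      # opaque handle, never inspected here
        tile_decoder: Callable[[Any], Sequence[float]],  # injected preprocessing
        descriptor_dim: int,
    ) -> None:
        self._runtime = inference_runtime
        self._handle = engine_handle
        self._decode = tile_decoder
        self.descriptor_dim = descriptor_dim

    def embed_batch(self, raw_tiles: Sequence[Any]) -> list[list[float]]:
        batch = [list(self._decode(raw)) for raw in raw_tiles]
        # "input" and "descriptors" are assumed tensor names.
        outputs = self._runtime.infer(self._handle, {"input": batch})
        if "descriptors" not in outputs:
            raise KeyError("engine returned no 'descriptors' output")
        rows = [list(row) for row in outputs["descriptors"]]
        for row in rows:
            if len(row) != self.descriptor_dim:
                raise ValueError(f"expected dim {self.descriptor_dim}, got {len(row)}")
        return rows


class FakeRuntime:
    """Test double: first component is the pixel sum, second is always zero."""

    def infer(self, handle: Any, inputs: dict) -> dict:
        return {"descriptors": [[float(sum(t)), 0.0] for t in inputs["input"]]}


def bytes_decoder(raw: bytes) -> list[float]:
    """Trivial decoder used only for the demo; a real one would decode JPEG."""
    return [float(b) for b in raw]


embedder = SketchEmbedder(FakeRuntime(), object(), bytes_decoder, descriptor_dim=2)
descriptors = embedder.embed_batch([b"\x01\x02", b"\x03"])
```

Because the decoder arrives as a plain callable, swapping the demo decoder for a real imaging stack later changes the wiring in the composition root and nothing in the adapter.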

4. Descriptor int64 id formula — reuse AZ-306, do not invent

DescriptorBatcher does NOT recompute the int64 id formula. It hands TileBboxRecord rows to the rebuilder; the rebuilder adapter projects to TileId; AZ-306's FaissDescriptorIndex.rebuild_from_descriptors uses the canonical tile_id_to_int64(TileId) helper. Test test_ac6_descriptor_id_mapping_matches_az306_scheme confirms by importing tile_id_to_int64 directly and asserting against the int.from_bytes(sha256("zoom|lat|lon").first8, "big", signed=True) formula.
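
For orientation, the formula quoted above can be written out as follows. This is a sketch of the quoted expression only: the function name sketch_tile_id_to_int64 is hypothetical, and the exact string formatting of lat and lon inside the "zoom|lat|lon" key is an assumption, so values produced here should not be expected to match the canonical tile_id_to_int64 helper.

```python
import hashlib


def sketch_tile_id_to_int64(zoom: int, lat: float, lon: float) -> int:
    """First 8 bytes of SHA-256 over "zoom|lat|lon", read as a signed big-endian int."""
    # Assumed key formatting; the real helper may format floats differently.
    key = f"{zoom}|{lat}|{lon}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # signed=True keeps the result inside the int64 range.
    return int.from_bytes(digest[:8], "big", signed=True)


tile_int_id = sketch_tile_id_to_int64(15, 50.45, 30.52)
```

Taking exactly 8 bytes with signed=True guarantees the value fits a signed 64-bit integer, which is the id type FAISS uses.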

Files Changed

Production code (new)

  • src/gps_denied_onboard/components/c10_provisioning/descriptor_batcher.py: DescriptorBatcher class + BatcherTile, TileBboxRecord, CorpusFilter, ProgressEvent, DescriptorBatchReport, BatcherOutcome, C10BatcherConfig DTOs + TilesByBboxBatchQuery, TilePixelOpener, DescriptorIndexRebuilder consumer Protocols.
  • src/gps_denied_onboard/components/c10_provisioning/c7_engine_embedder.py: C7EngineBackboneEmbedder adapter wrapping the AZ-297 InferenceRuntime surface; Any-typed to stay below the AZ-270 boundary.

Production code (modified)

  • src/gps_denied_onboard/components/c10_provisioning/interface.py — added BackboneEmbedder Protocol (embed_batch + descriptor_dim), runtime_checkable.
  • src/gps_denied_onboard/components/c10_provisioning/errors.py — added DescriptorBatchError exception class extending C10ProvisioningError.
  • src/gps_denied_onboard/components/c10_provisioning/__init__.py — re-exported all new symbols.
  • src/gps_denied_onboard/runtime_root/c10_factory.py — added build_descriptor_batcher plus the three C6→C10 adapter functions.

Tests (new)

  • tests/unit/c10_provisioning/test_descriptor_batcher.py — 16 tests covering AC-1 through AC-10 + NFR-perf-overhead + NFR-reliability-bounded-retry, plus 4 supplemental tests (Protocol runtime-check for the four consumer cuts, query-args pass-through, handle release on embed failure, config validation).

Documentation

  • _docs/02_document/module-layout.md — c10 Public API + Internal section updated to list the AZ-322 surface; composition root section lists the new factory + adapters.
  • _docs/02_document/components/11_c10_provisioning/description.md — §5 dependency table picks up numpy; new "AZ-322 internal phase" subsection summarises the batcher's contract / OOM behaviour / progress reporting / id formula.

Test Results

  • 16 / 16 AZ-322 tests pass (tests/unit/c10_provisioning/test_descriptor_batcher.py).
  • 197 / 197 c10 + c6 + runtime-root targeted runs pass (59 docker-skip).
  • Full project suite: 1352 passed, 79 skipped, 1 failed.
    • 79 skipped: docker / Jetson / CUDA / actionlint env-gated (Tier-0 dev host).
    • 1 failed: tests/unit/test_ac1_scaffold_layout.py::test_cmake_files_configure — pre-existing OKVIS2 git-submodule failure documented in batch_35 cycle report; unrelated to this batch.

Decisions Ledger

| Decision | Rationale |
| --- | --- |
| DescriptorBatcher.__init__ takes consumer-side Protocols, not raw C6 types | AZ-270 lint blocks direct cross-component imports; AZ-323 / AZ-324 set the precedent |
| C7EngineBackboneEmbedder parameters are Any-typed | AZ-270 AST lint flags TYPE_CHECKING imports too; structural duck-typing avoids the boundary |
| tile_decoder is injected, not bundled | JPEG preprocessing belongs to E-C2 (AZ-255); keeping it out of AZ-322 narrows scope and dependencies |
| Default C10BatcherConfig.max_oom_retries=1 | Spec NFR-reliability-bounded-retry; one halve from 64 → 32 is the standard surface, deeper retries mask GPU regressions |
| Reuse AZ-306's tile_id_to_int64 | Spec AC-6; inventing the formula here would diverge from C6's id scheme |
| Atomic FAISS rebuild guaranteed by AZ-306, not duplicated here | Spec AC-7; the batcher's role is to call rebuild_from_descriptors exactly once |
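
The bounded halve-and-retry policy behind the max_oom_retries default can be sketched as below. This is an illustration under assumptions, not the shipped DescriptorBatcher: the function and exception names are hypothetical, the retry budget is assumed to apply to the whole run, and the halved batch size is assumed to stay in effect for the remaining tiles.

```python
from typing import Callable, Sequence


class SketchOutOfMemory(RuntimeError):
    """Hypothetical stand-in for whatever the engine raises on OOM."""


class SketchBatchError(RuntimeError):
    """Hypothetical stand-in for DescriptorBatchError."""


def embed_with_halving(
    items: Sequence[int],
    embed: Callable[[Sequence[int]], list],
    batch_size: int = 64,
    max_oom_retries: int = 1,
) -> list:
    results: list = []
    retries_left = max_oom_retries
    start = 0
    while start < len(items):
        chunk = items[start:start + batch_size]
        try:
            results.extend(embed(chunk))
        except SketchOutOfMemory as exc:
            if retries_left == 0 or batch_size == 1:
                # Budget exhausted: surface the failure instead of shrinking further.
                raise SketchBatchError(f"OOM at batch_size={batch_size}") from exc
            retries_left -= 1
            batch_size //= 2  # 64 -> 32 on the first (and by default only) retry
            continue  # retry the same offset with the smaller batch
        start += len(chunk)
    return results


def flaky_embed(chunk: Sequence[int]) -> list:
    """Demo embedder that runs out of memory above 32 items."""
    if len(chunk) > 32:
        raise SketchOutOfMemory()
    return [x * 2 for x in chunk]


embedded = embed_with_halving(list(range(100)), flaky_embed)
```

With the default budget of one retry, a second OOM at the halved size is raised as an error rather than absorbed, which is the behaviour the ledger's rationale argues for.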

Notes

  • The C7EngineBackboneEmbedder is the default BackboneEmbedder impl, but production wiring to a real C7 engine awaits AZ-326 (T5 orchestrator) and AZ-255 (real C2 backbone preprocessing). The adapter is unit-tested via fakes today; integration tests land with AZ-326.
  • C10BatcherConfig currently has no dedicated config-block hook in C10ProvisioningConfig; build_descriptor_batcher uses defaults. AZ-326 will add the config-block plumbing.
  • The OKVIS2 cmake submodule failure remains and is independent of every batch-35 / batch-36 change. It will resolve when the project's submodules are initialised on the dev host.