Implements the public top-level F1 build orchestrator for E-C10 per contract v1.1.0. Composes EngineCompiler (AZ-321), DescriptorBatcher (AZ-322), and ManifestBuilder (AZ-323) into a single idempotent operation guarded by an fcntl-backed cache_root/.c10.lock and a post-build coverage walk.

Adds:
- CacheProvisionerImpl + FilelockFileLockFactory (provisioner.py)
- BuildRequest/BuildReport/BuildOutcome/SectorClassification DTOs + FileLockFactory Protocol + replaced placeholder CacheProvisioner Protocol with v1.1.0 surface (interface.py)
- C10ProvisionerConfig wired into C10ProvisioningConfig (config.py)
- BuildLockHeldError + ManifestCoverageError (errors.py)
- build_cache_provisioner composition root (c10_factory.py)
- 18 tests covering AC-1..AC-16 + NFR-perf-coverage-walk
- filelock>=3.13,<4.0 (single new third-party dep)

Idempotence (CP-INV-1) reuses AZ-323's _compute_manifest_hash / _aggregate_tile_hash so the build-identity decision agrees byte-for-byte with the Manifest's recorded manifest_hash. Coverage rollback uses a .prev rename snapshot. Diagnostic compile_engines_for_corpus is lock-free per AC-10.

Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 37 — Cycle 1 Report
Date: 2026-05-13
Batch: 37 (single task; closes the C10 build-phase trilogy AZ-321/322/323/325)
Tasks: AZ-325 (C10 CacheProvisioner orchestrator, 3pt)
Status: complete; AZ-325 pending transition to "In Testing".
Scope
AZ-325 implements CacheProvisionerImpl — the public top-level F1 build
orchestrator for E-C10. It composes EngineCompiler (AZ-321),
DescriptorBatcher (AZ-322), and ManifestBuilder (AZ-323) into a
single idempotent operation guarded by a filesystem lockfile and a
post-build coverage walk.
This unblocks E-C12 OperatorTooling — c10 build becomes a one-liner —
and provides the final assembly point for D-C10-1 idempotence and
D-C10-3 ManifestCoverageError.
Architectural Decisions
1. Public surface lives in interface.py only
The contract _docs/02_document/contracts/c10_provisioning/cache_provisioner.md
v1.1.0 defines CacheProvisioner Protocol + BuildRequest /
BuildReport / BuildOutcome / SectorClassification DTOs +
FileLockFactory Protocol. These all live in interface.py — the
single public API surface for the component. The implementation
(provisioner.py) imports the Protocols and DTOs from there and
declares only the implementation classes in its own __all__. This
matches the pattern established by AZ-321 / AZ-323 / AZ-324.
2. Build-identity hash byte-aligned with AZ-323
AZ-325's idempotence check has to match the manifest_hash AZ-323 wrote
into the prior Manifest.json byte-for-byte. Re-implementing the hash
formula here would risk drift. We instead import AZ-323's existing
_compute_manifest_hash and _aggregate_tile_hash helpers directly and
reconstruct the inputs the helper needs from a combination of the new
BuildRequest (for tiles_coverage_sha256, calibration_sha256,
sector/bbox/zoom/origin/flight) and the prior Manifest's recorded
artifacts (engine SHA-256s, descriptor index SHA-256). The leading
underscore on the helpers is acknowledged technical debt — it remains
finding F1 from the batch 31–33 cumulative review, with a deferred
hygiene PBI to extract a shared _build_identity module after AZ-324
ships. The decision is documented inline in provisioner.py:43-50.
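
The reason for sharing one hash helper can be illustrated with a minimal sketch. Everything here is hypothetical: `compute_identity_hash` stands in for AZ-323's private helpers (whose real formula is not shown in this report), and the dictionary keys are illustrative names only. The point is that the idempotence check rebuilds the inputs from the new request plus the prior Manifest's artifact digests, then compares against the recorded hash using the same function that produced it.

```python
import hashlib
import json


def compute_identity_hash(inputs: dict) -> str:
    """Hash a canonical JSON encoding so equal inputs always give equal bytes.

    Hypothetical stand-in; the real AZ-323 formula is not reproduced here.
    """
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_idempotent(request_fields: dict, prior_manifest: dict) -> bool:
    """Rebuild the hash inputs from the new request plus the prior artifacts."""
    inputs = {
        **request_fields,
        # Artifact digests come from the prior Manifest, not from a rebuild.
        "engine_sha256s": prior_manifest["engine_sha256s"],
        "descriptor_index_sha256": prior_manifest["descriptor_index_sha256"],
    }
    return compute_identity_hash(inputs) == prior_manifest["manifest_hash"]


# Illustrative values only.
request = {"tiles_coverage_sha256": "aa", "calibration_sha256": "bb", "zoom": 17}
artifacts = {"engine_sha256s": ["e1", "e2"], "descriptor_index_sha256": "d1"}
manifest = {**artifacts, "manifest_hash": compute_identity_hash({**request, **artifacts})}

unchanged = is_idempotent(request, manifest)
changed = is_idempotent({**request, "calibration_sha256": "cc"}, manifest)
```

Because both the writer and the checker call the same function, any change to the canonical encoding moves both sides together, which is the drift risk a second implementation would introduce.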
3. Idempotence path performs zero compile / embed / write work
CP-INV-1 + AC-2 are explicit: a warm idempotent re-run must result in
zero calls to compile_engines_for_corpus, zero calls to
populate_descriptors, zero calls to build_manifest, and the on-disk
Manifest.json must remain byte-identical (mtime unchanged). The
orchestrator never instantiates a write path before the idempotence
check returns — only tile_metadata_store.query_by_bbox (a read) +
Manifest.json parse + SHA-256 of calibration_path are touched. All
spies in the unit tests verify this.
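
A spy-based check of this property might look like the sketch below. It is not the project's test code: the function signature, the hash arguments, and the outcome labels are assumptions made for illustration; only the three spied method names come from the report.

```python
from unittest.mock import Mock


def build_cache_artifacts(request_hash, prior_hash, compiler, batcher, builder):
    """Return early, before touching any write path, when identities match."""
    if prior_hash is not None and request_hash == prior_hash:
        return "idempotent_noop"  # hypothetical outcome label
    compiler.compile_engines_for_corpus()
    batcher.populate_descriptors()
    builder.build_manifest()
    return "built"  # hypothetical outcome label


# Mock objects record every call, so they double as spies.
compiler, batcher, builder = Mock(), Mock(), Mock()

warm = build_cache_artifacts("h1", "h1", compiler, batcher, builder)
warm_calls = (
    compiler.compile_engines_for_corpus.call_count,
    batcher.populate_descriptors.call_count,
    builder.build_manifest.call_count,
)

# A changed identity falls through to the active build phases.
cold = build_cache_artifacts("h2", "h1", compiler, batcher, builder)
```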
4. Coverage rollback uses .prev snapshot, not in-memory bytes
_run_active_build snapshots the prior-good Manifest by renaming
Manifest.json → Manifest.json.prev BEFORE the active phases run.
Every error path (engine compile raise, descriptor batcher raise,
manifest builder raise, ManifestCoverageError) calls
_restore_prior_manifest which deletes the new partial Manifest and
renames .prev back. This guarantees CP-INV-2 (failed build leaves
cache no worse than at start) without holding bytes in memory across
the whole build.
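
The snapshot-and-restore flow can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in provisioner.py: `run_with_rollback` and `failing_build` are hypothetical names, and discarding the .prev file after a successful build is an assumption the report does not confirm.

```python
import os
import tempfile
from pathlib import Path


def run_with_rollback(cache_root: Path, build, manifest_filename="Manifest.json"):
    manifest = cache_root / manifest_filename
    prev = cache_root / (manifest_filename + ".prev")
    had_prior = manifest.exists()
    if had_prior:
        os.replace(manifest, prev)  # snapshot before any active phase runs
    try:
        build(manifest)
    except BaseException:
        manifest.unlink(missing_ok=True)  # drop the partial Manifest, if any
        if had_prior:
            os.replace(prev, manifest)  # restore the prior-good Manifest
        raise
    # Assumption: the snapshot is discarded once the build succeeds.
    prev.unlink(missing_ok=True)


def failing_build(manifest: Path):
    """Hypothetical build that writes a partial Manifest and then fails."""
    manifest.write_text("partial")
    raise RuntimeError("coverage walk failed")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Manifest.json").write_text("prior-good")
    try:
        run_with_rollback(root, failing_build)
    except RuntimeError:
        pass
    restored = (root / "Manifest.json").read_text()
    leftover_prev = (root / "Manifest.json.prev").exists()
```

Only file names move during the snapshot and restore, so memory use does not grow with Manifest size.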
5. Lockfile uses filelock package (fcntl-backed on POSIX)
The FileLockFactory Protocol is the seam; the default
FilelockFileLockFactory wraps filelock.FileLock (fcntl flock on
POSIX → kernel auto-releases on process exit, satisfying the SIGKILL
clause of AC-8; msvcrt locks on Windows). On acquisition timeout, the
wrapper re-raises as the contract's typed BuildLockHeldError.
Lockfile cleanup is best-effort — a leftover .c10.lock is harmless
(filelock re-uses the file on next acquisition); the kernel-level
advisory lock is what enforces mutual exclusion.
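
To show the kernel-level behaviour described above without depending on the third-party package, the following POSIX-only sketch calls `fcntl.flock` directly instead of going through `filelock.FileLock`. `SketchLockGuard` is a hypothetical name (not the `_LockGuard` in provisioner.py), the polling interval is an arbitrary choice, and `BuildLockHeldError` is redefined locally as a stand-in for the contract's typed error.

```python
import fcntl
import os
import tempfile
import time


class BuildLockHeldError(RuntimeError):
    """Local stand-in for the contract's typed error."""


class SketchLockGuard:
    def __init__(self, lock_path, timeout_s):
        self._path = lock_path
        self._timeout_s = timeout_s
        self._fd = None

    def __enter__(self):
        self._fd = os.open(self._path, os.O_RDWR | os.O_CREAT, 0o644)
        deadline = time.monotonic() + self._timeout_s
        while True:
            try:
                # Non-blocking exclusive advisory lock on this open file.
                fcntl.flock(self._fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return self
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    os.close(self._fd)
                    self._fd = None
                    raise BuildLockHeldError(self._path) from None
                time.sleep(0.01)

    def __exit__(self, *exc_info):
        fcntl.flock(self._fd, fcntl.LOCK_UN)
        os.close(self._fd)  # the lockfile itself is left on disk
        self._fd = None
        return False


with tempfile.TemporaryDirectory() as tmp:
    lock_path = os.path.join(tmp, ".c10.lock")
    with SketchLockGuard(lock_path, timeout_s=1.0):
        try:
            with SketchLockGuard(lock_path, timeout_s=0.05):
                contended = False
        except BuildLockHeldError:
            contended = True
    # The leftover lockfile does not block a later acquisition.
    with SketchLockGuard(lock_path, timeout_s=0.05):
        reacquired = True
```

The lock belongs to the open file description, so the kernel drops it when the descriptor closes, including when the owning process is killed.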
6. Diagnostic compile_engines_for_corpus is lock-free
AC-10 / CP-TC-11: the engine-only diagnostic passthrough does NOT
acquire the lockfile. Operators run this for hardware-change scenarios
where forcing a full transactional build would be overkill, and the
lock-free path keeps it from contending with a concurrently-held lock
from an unrelated build_cache_artifacts invocation (covered by
test_diagnostic_engine_compile_does_not_acquire_lock).
7. C10ProvisionerConfig lives at the top of C10ProvisioningConfig
The new config dataclass (coverage_strict, lock_timeout_s,
manifest_filename) is wired in as C10ProvisioningConfig.provisioner,
matching the existing manifest / engine_compiler sub-block pattern.
The composition root reads block.provisioner and passes it directly
into the orchestrator's constructor.
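
The nesting can be pictured as below. Only the class names, the three field names, and the `provisioner` attribute come from this report; the field types, the default values, and the use of frozen dataclasses are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class C10ProvisionerConfig:
    coverage_strict: bool = True  # assumed type and default
    lock_timeout_s: float = 30.0  # assumed type and default
    manifest_filename: str = "Manifest.json"  # assumed default


@dataclass(frozen=True)
class C10ProvisioningConfig:
    # Sibling sub-blocks (manifest, engine_compiler) are omitted in this sketch.
    provisioner: C10ProvisionerConfig = field(default_factory=C10ProvisionerConfig)


block = C10ProvisioningConfig()
provisioner_config = block.provisioner  # what a composition root would pass on
```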
Files Changed
Production code (new)
- src/gps_denied_onboard/components/c10_provisioning/provisioner.py: CacheProvisionerImpl (orchestrator) + _LockGuard + FilelockFileLockFactory.
Production code (modified)
- pyproject.toml: added filelock>=3.13,<4.0 (single new third-party dep, per task constraint).
- src/gps_denied_onboard/components/c10_provisioning/interface.py: replaced placeholder CacheProvisioner Protocol with v1.1.0 surface; added BuildOutcome, BuildRequest, BuildReport, SectorClassification, FileLockFactory.
- src/gps_denied_onboard/components/c10_provisioning/errors.py: added BuildLockHeldError, ManifestCoverageError.
- src/gps_denied_onboard/components/c10_provisioning/config.py: added C10ProvisionerConfig + integrated as C10ProvisioningConfig.provisioner sub-block.
- src/gps_denied_onboard/components/c10_provisioning/__init__.py: re-exported new public symbols.
- src/gps_denied_onboard/runtime_root/c10_factory.py: added build_cache_provisioner(config, *, engine_compiler, descriptor_batcher, manifest_builder, tile_metadata_store, host, precision, clock) composition-root factory.
Tests (new)
- tests/unit/c10_provisioning/test_cache_provisioner.py: 18 tests covering AC-1..AC-16 + NFR-perf-coverage-walk + test_diagnostic_engine_compile_does_not_acquire_lock supplemental. AC-12 (cold-build benchmark) is wired with pytest.skip() and runs manually on a Tier-1 GPU host only.
Test Results
- 17 / 17 AZ-325 tests pass; 1 GPU-only test skipped as expected.
- 80 / 80 targeted runs pass on tests/unit/c10_provisioning/ (excluding the pre-existing AZ-322 faiss-import failure) + tests/unit/composition_root/.
- One pre-existing failure is unchanged from HEAD: tests/unit/c10_provisioning/test_descriptor_batcher.py::test_ac6_descriptor_id_mapping_matches_az306_scheme fails with ModuleNotFoundError: No module named 'faiss' because faiss is an optional Tier-1 dependency. Verified pre-existing by git stash + re-run on HEAD. Not introduced by AZ-325; tracked in _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md context.
Decisions Ledger
| Decision | Rationale |
|---|---|
| Public surface centralised in interface.py | Mirrors AZ-321 / AZ-323 / AZ-324; one source of truth for contract Protocols + DTOs |
| Idempotence uses AZ-323's private hash helpers | Byte-for-byte agreement with the on-disk manifest_hash; refactor deferred to a hygiene PBI |
| .prev rollback over in-memory snapshot | Lower memory pressure for large Manifests; rename is atomic |
| filelock chosen over fasteners | Already idiomatic for the project size; fcntl-backed; SIGKILL-safe |
| Diagnostic passthrough is lock-free | AC-10; operator-controlled engine-only re-compile must not contend with a held lock |
| C10ProvisionerConfig is a sub-block of C10ProvisioningConfig | Matches existing manifest / engine_compiler pattern; keeps the config tree shallow |
Notes
- build_cache_provisioner is wired but no integration test exists yet for the full real-AZ-321/322/323 pipeline (requires GPU + FAISS + TRT). E2E coverage lands with AZ-326 (T5 orchestrator), which composes the provisioner into the operator CLI.
- F1 from the batch 31–33 cumulative review (verifier importing private helper from manifest_builder) carries over; AZ-325 also depends on the same private helpers. The hygiene PBI to extract a shared _build_identity module is intentionally deferred: both consumers (AZ-324 verifier + AZ-325 provisioner) need the same helper, and a single refactor PBI after AZ-326 is cleaner than re-touching each consumer twice.
- The OKVIS2 cmake submodule failure (carryover from batch 35/36) remains and is independent of this batch.