[AZ-321] C10 EngineCompiler: hardware-tied TRT compile + cache reuse

Land the C10 per-model engine compile + cache-reuse orchestrator.
`EngineCompiler.compile_engines_for_corpus(request)` walks the
corpus, computes the canonical engine filename via AZ-281
`EngineFilenameSchema.build`, and either reuses the cached binary
(cache hit, AZ-280 `Sha256Sidecar.verify` returns True) or delegates
to the AZ-297 `compile_engine` on the injected runtime (cache miss;
the runtime owns the write path). Returns one `EngineCompileResult`
per backbone carrying the canonical `EngineCacheEntry`, outcome
(BUILT / REUSED), and `compile_duration_s` (None on reuse).
Hardware-tied reuse (D-C10-6 / D-C10-7) falls out of the filename
schema — a host change rebuilds at the new path and leaves the old
files untouched (AC-4).

Design corrections vs. the task spec body:
- The spec proposed a c10-local `EngineCacheEntry` carrying outcome
  and duration; that name is already taken by the AZ-297 canonical
  DTO. The wrapper is renamed `EngineCompileResult`; the canonical
  shape wins.
- The spec called `InferenceRuntime.host_info()`, which is not in
  the AZ-297 Protocol. `HostCapabilities` is threaded through
  `EngineCompileRequest` instead so the composition root owns host
  probing and the compiler stays decoupled.
- The c10 layer cannot import `components.c7_inference` (arch rule
  `test_az270_compose_root.test_ac6`). `engine_compiler.py` defines
  `CompileEngineCallable` — a structural Protocol cut of
  `InferenceRuntime` exposing only `compile_engine` — and catches
  broad `Exception` (re-raising preserves the original type;
  `error_class` is recorded in the ERROR log payload).

Production
- engine_compiler.py: `CompileOutcome` enum, `BackboneSpec`,
  `EngineCompileRequest`, `EngineCompileResult`,
  `EngineCompileSummary` DTOs; `CompileEngineCallable` Protocol;
  `EngineCompiler` with the single public method.
- config.py: `BackboneConfig` + `C10ProvisioningConfig`
  (`workspace_mb` default 4 GiB to match C7 NFT-LIM-01); validate
  positive shape dims and duplicate model_name detection in
  `__post_init__`.
- runtime_root/c10_factory.py: `build_engine_compiler(config)` wires
  the existing `build_inference_runtime` factory through;
  `build_backbone_specs(config)` materialises the `BackboneSpec`
  tuple from the config block.
- components/c10_provisioning/__init__.py: re-exports the AZ-321
  surface and registers the new config block.

Tests
- test_engine_compiler.py: covers AC-1..AC-10 + missing-sidecar
  sibling case for AC-5. Tier-1 via fake runtime that writes through
  the REAL `Sha256Sidecar.write_atomic_and_sidecar`. Tier-2
  placeholders for the cache-hit p99 NFR (200 MB engine sweep) and
  kill-during-compile atomic-write NFR.

Docs
- module-layout.md: c10_provisioning Per-Component Mapping lists the
  new internal modules (engine_compiler.py, config.py), the
  composition-root c10_factory.py, the AZ-321 public re-export
  surface, and the registered config block.
- batch_33_cycle1_report.md + reviews/batch_33_review.md:
  PASS_WITH_WARNINGS (4 Low findings accepted).

Tests run: c10_provisioning 13 passing + 2 Tier-2 skips; combined
unit suite (excluding pending components) 543 passing, 21
env-skipped.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 00:09:53 +03:00
parent 0ad3278b12
commit 0dfe7c5301
10 changed files with 1538 additions and 7 deletions
+6 -2
View File
@@ -208,10 +208,14 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
- **Epic**: AZ-252 (E-C10 Cache Provisioner)
- **Directory**: `src/gps_denied_onboard/components/c10_provisioning/`
- **Public API**:
- `__init__.py` (re-exports `CacheProvisioner`, `Manifest`, `EngineCacheEntry`)
- `__init__.py` (re-exports `CacheProvisioner`, `Manifest`, `EngineCacheEntry`, plus AZ-321 surface: `EngineCompiler`, `BackboneSpec`, `EngineCompileRequest`, `EngineCompileResult`, `CompileOutcome`, `EngineCompileSummary`, `CompileEngineCallable`, `BackboneConfig`, `C10ProvisioningConfig`)
- `interface.py` (`CacheProvisioner` Protocol)
- Config block: `C10ProvisioningConfig` (registered on import)
- **Internal**:
- `default_provisioner.py` (engine compile + descriptors + manifest + content-hash gate)
- `engine_compiler.py` (AZ-321; per-model TRT compile + hardware-tied cache reuse + `CompileEngineCallable` structural cut of the C7 InferenceRuntime)
- `config.py` (AZ-321; `BackboneConfig` + `C10ProvisioningConfig` dataclasses)
- `default_provisioner.py` (engine compile + descriptors + manifest + content-hash gate, pending)
- Composition root: `runtime_root/c10_factory.py` (`build_engine_compiler`, `build_backbone_specs`)
- **Owns**: `src/gps_denied_onboard/components/c10_provisioning/**`, `tests/unit/c10_provisioning/**`
- **Imports from**: `_types`, `helpers.sha256_sidecar`, `helpers.engine_filename_schema`, `helpers.wgs_converter`, `components.c6_tile_cache` (Public API), `components.c7_inference` (Public API: engine compile surface), `config`, `logging`, `fdr_client`
- **Consumed by**: `c12_operator_tooling`, `runtime_root` (operator binary only — excluded from airborne via `BUILD_C10_PROVISIONING=OFF` for airborne build per ADR-002)