mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 09:51:13 +00:00
[AZ-321] C10 EngineCompiler: hardware-tied TRT compile + cache reuse
Land the C10 per-model engine compile + cache-reuse orchestrator.
`EngineCompiler.compile_engines_for_corpus(request)` walks the corpus,
computes the canonical engine filename via AZ-281
`EngineFilenameSchema.build`, and either reuses the cached binary
(cache hit, AZ-280 `Sha256Sidecar.verify` returns True) or delegates to
the AZ-297 `compile_engine` on the injected runtime (cache miss; the
runtime owns the write path). Returns one `EngineCompileResult` per
backbone carrying the canonical `EngineCacheEntry`, outcome (BUILT /
REUSED), and `compile_duration_s` (None on reuse).

Hardware-tied reuse (D-C10-6 / D-C10-7) falls out of the filename
schema: a host change rebuilds at the new path and leaves the old files
untouched (AC-4).

Design corrections vs. the task spec body:
- The spec proposed a c10-local `EngineCacheEntry` carrying outcome and
  duration; that name is already taken by the AZ-297 canonical DTO. The
  wrapper is renamed `EngineCompileResult`; the canonical shape wins.
- The spec called `InferenceRuntime.host_info()`, which is not in the
  AZ-297 Protocol. `HostCapabilities` is threaded through
  `EngineCompileRequest` instead so the composition root owns host
  probing and the compiler stays decoupled.
- The c10 layer cannot import `components.c7_inference` (arch rule
  `test_az270_compose_root.test_ac6`). `engine_compiler.py` defines
  `CompileEngineCallable`, a structural Protocol cut of
  `InferenceRuntime` exposing only `compile_engine`, and catches broad
  `Exception` (re-raising preserves the original type; `error_class` is
  recorded in the ERROR log payload).

Production
- engine_compiler.py: `CompileOutcome` enum, `BackboneSpec`,
  `EngineCompileRequest`, `EngineCompileResult`, `EngineCompileSummary`
  DTOs; `CompileEngineCallable` Protocol; `EngineCompiler` with the
  single public method.
- config.py: `BackboneConfig` + `C10ProvisioningConfig` (`workspace_mb`
  default 4 GiB to match C7 NFT-LIM-01); `__post_init__` validates
  positive shape dims and rejects duplicate `model_name` values.
- runtime_root/c10_factory.py: `build_engine_compiler(config)` wires the
  existing `build_inference_runtime` factory through;
  `build_backbone_specs(config)` materialises the `BackboneSpec` tuple
  from the config block.
- components/c10_provisioning/__init__.py: re-exports the AZ-321 surface
  and registers the new config block.

Tests
- test_engine_compiler.py: covers AC-1..AC-10 + missing-sidecar sibling
  case for AC-5. Tier-1 via fake runtime that writes through the REAL
  `Sha256Sidecar.write_atomic_and_sidecar`. Tier-2 placeholders for the
  cache-hit p99 NFR (200 MB engine sweep) and kill-during-compile
  atomic-write NFR.

Docs
- module-layout.md: c10_provisioning Per-Component Mapping lists the new
  internal modules (engine_compiler.py, config.py), the composition-root
  c10_factory.py, the AZ-321 public re-export surface, and the
  registered config block.
- batch_33_cycle1_report.md + reviews/batch_33_review.md:
  PASS_WITH_WARNINGS (4 Low findings accepted).

Tests run: c10_provisioning 13 passing + 2 Tier-2 skips; combined unit
suite (excluding pending components) 543 passing, 21 env-skipped.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -208,10 +208,14 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
 - **Epic**: AZ-252 (E-C10 Cache Provisioner)
 - **Directory**: `src/gps_denied_onboard/components/c10_provisioning/`
 - **Public API**:
-  - `__init__.py` (re-exports `CacheProvisioner`, `Manifest`, `EngineCacheEntry`)
+  - `__init__.py` (re-exports `CacheProvisioner`, `Manifest`, `EngineCacheEntry`, plus AZ-321 surface: `EngineCompiler`, `BackboneSpec`, `EngineCompileRequest`, `EngineCompileResult`, `CompileOutcome`, `EngineCompileSummary`, `CompileEngineCallable`, `BackboneConfig`, `C10ProvisioningConfig`)
   - `interface.py` (`CacheProvisioner` Protocol)
+  - Config block: `C10ProvisioningConfig` (registered on import)
 - **Internal**:
-  - `default_provisioner.py` (engine compile + descriptors + manifest + content-hash gate)
+  - `engine_compiler.py` (AZ-321; per-model TRT compile + hardware-tied cache reuse + `CompileEngineCallable` structural cut of the C7 InferenceRuntime)
+  - `config.py` (AZ-321; `BackboneConfig` + `C10ProvisioningConfig` dataclasses)
+  - `default_provisioner.py` (engine compile + descriptors + manifest + content-hash gate, pending)
+  - Composition root: `runtime_root/c10_factory.py` (`build_engine_compiler`, `build_backbone_specs`)
 - **Owns**: `src/gps_denied_onboard/components/c10_provisioning/**`, `tests/unit/c10_provisioning/**`
 - **Imports from**: `_types`, `helpers.sha256_sidecar`, `helpers.engine_filename_schema`, `helpers.wgs_converter`, `components.c6_tile_cache` (Public API), `components.c7_inference` (Public API: engine compile surface), `config`, `logging`, `fdr_client`
 - **Consumed by**: `c12_operator_tooling`, `runtime_root` (operator binary only; excluded from airborne via `BUILD_C10_PROVISIONING=OFF` for airborne build per ADR-002)

@@ -0,0 +1,218 @@
# Batch 33 / Cycle 1: Implementation Report

**Date**: 2026-05-12
**Tasks**: AZ-321 (C10 EngineCompiler: per-model TRT compile +
hardware-tied cache reuse + AZ-281 filename schema + AZ-280 sidecar
gate)
**Story points landed**: 5
**Status**: complete (AZ-321 → In Testing)

## Scope summary

Single-task batch landing the C10 per-model engine compile +
cache-reuse orchestrator. `EngineCompiler.compile_engines_for_corpus(req)`
walks the corpus, computes the canonical engine filename via AZ-281
`EngineFilenameSchema.build(...)`, and either reuses the cached
binary (cache hit; AZ-280 `Sha256Sidecar.verify` returns True) or
delegates to the AZ-297 `compile_engine` method on the injected
runtime (cache miss; the runtime owns the write path and the sidecar
emission). The orchestrator returns one `EngineCompileResult` per
backbone carrying the canonical `EngineCacheEntry`, the
`CompileOutcome.{BUILT,REUSED}` label, and the `compile_duration_s`
(None on reuse). Hardware-tied cache reuse (D-C10-6 / D-C10-7) falls
out naturally from the filename schema: an engine compiled on
`(sm=87, jp=6.2, trt=10.3, fp16)` lives at a different path than one
compiled on `(sm=89, jp=6.3, trt=10.5, fp16)`, so a hardware change
produces cache misses for the new device and leaves the old files
untouched (AC-4).
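
The hit-or-miss decision and the hardware-tied path can be sketched as below. This is a minimal illustration, not the shipped `EngineCompiler`: `engine_filename` is a hypothetical stand-in for `EngineFilenameSchema.build` whose layout is copied from the AC-8 example filename, and `compile_or_reuse`, `fake_compile`, and the always-true `verify` callable are assumptions made for the demo. The real compiler also treats a raising `verify` as a miss, which this sketch leaves out.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, Optional, Tuple


def engine_filename(model: str, sm: int, jp: str, trt: str, precision: str) -> str:
    """Hypothetical stand-in for EngineFilenameSchema.build (layout from AC-8)."""
    return f"{model}__sm{sm}_jp{jp}_trt{trt}_{precision}.engine"


def compile_or_reuse(
    target: Path,
    verify: Callable[[Path], bool],
    compile_fn: Callable[[Path], None],
) -> Tuple[str, Optional[float]]:
    """Return (outcome, compile_duration_s); the duration is None on reuse."""
    if target.exists() and verify(target):
        return "REUSED", None
    started = time.perf_counter()
    compile_fn(target)  # the runtime, not the compiler, owns the write path
    return "BUILT", time.perf_counter() - started


def fake_compile(path: Path) -> None:
    """Assumed fake runtime: writes placeholder bytes instead of a TRT engine."""
    path.write_bytes(b"engine-bytes")


with tempfile.TemporaryDirectory() as tmp:
    cache_root = Path(tmp)
    old_host = cache_root / engine_filename("dinov2_vpr", 87, "6.2", "10.3", "fp16")
    new_host = cache_root / engine_filename("dinov2_vpr", 89, "6.3", "10.5", "fp16")

    first = compile_or_reuse(old_host, lambda p: True, fake_compile)   # cold cache
    second = compile_or_reuse(old_host, lambda p: True, fake_compile)  # warm cache
    moved = compile_or_reuse(new_host, lambda p: True, fake_compile)   # new host
    old_still_there = old_host.exists()
```

Because the host tuple is part of the filename, the new host never looks at the old path, so no explicit invalidation step is needed.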

Two design corrections vs. the task spec body:

- **`EngineCacheEntry` shape**: the task spec proposed a c10-local
  `EngineCacheEntry` with `outcome` and `compile_duration_s` fields.
  That clashes with the canonical AZ-297
  `_types.inference.EngineCacheEntry` already re-exported from
  `components.c10_provisioning`. The canonical shape wins; the AZ-321
  wrapper is renamed `EngineCompileResult` and carries
  `{entry, outcome, compile_duration_s}` cleanly.
- **`InferenceRuntime.host_info()`**: the task spec calls a
  hypothetical `host_info()` method on the runtime to retrieve
  `(sm, jp, trt)`. The AZ-297 Protocol does NOT expose host info.
  Rather than expand the frozen Protocol mid-cycle, we accept a
  `HostCapabilities` field on `EngineCompileRequest` so the
  composition root threads the host from its own probe (Tier-2
  device introspection or Tier-1 test fixture). The compiler stays
  decoupled from any runtime-side introspection surface.

The C10 layer is also forbidden by `test_az270_compose_root.test_ac6`
from importing from `components.c7_inference` directly. That rule
applies across all `components/*/*.py` files regardless of what the
prose-level `module-layout.md` declares the "Imports from" list to be.
The lint test wins. To respect it, `engine_compiler.py` defines
`CompileEngineCallable`, a structural Protocol cut of
`InferenceRuntime` exposing only the single `compile_engine` method
the compiler actually uses, and catches the broader `Exception`
class. The AZ-297 C7 error family stays the runtime's contract; the
compiler records `type(exc).__name__` as `error_class` in its ERROR
log payload and re-raises so the original exception type propagates
to the caller intact.
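
A minimal sketch of the structural cut plus the log-and-re-raise pattern described above. The `compile_engine` parameter list, the logger name, the local `CalibrationCacheError` class, and the `compile_one` helper are assumptions for illustration; the report does not show the real AZ-297 signature.

```python
import logging
from typing import Any, Protocol, runtime_checkable

log = logging.getLogger("c10.engine.compile")  # assumed logger name


@runtime_checkable
class CompileEngineCallable(Protocol):
    # Assumed signature; only the method name comes from the report.
    def compile_engine(self, onnx_path: str, build_config: Any) -> Any: ...


class CalibrationCacheError(RuntimeError):
    """Local stand-in for the C7 error of the same name."""


class FailingRuntime:
    """Satisfies the Protocol structurally, without inheriting from it."""

    def compile_engine(self, onnx_path: str, build_config: Any) -> Any:
        raise CalibrationCacheError("calibration cache unreadable")


def compile_one(runtime: CompileEngineCallable, model_name: str, onnx_path: str) -> Any:
    try:
        return runtime.compile_engine(onnx_path, build_config=None)
    except Exception as exc:
        # The C7 error classes cannot be imported here, so record the class name.
        log.error(
            "c10.engine.compile.error",
            extra={"model_name": model_name, "error_class": type(exc).__name__},
        )
        raise  # bare raise keeps the original exception type and traceback


caught = None
try:
    compile_one(FailingRuntime(), "lightglue", "lightglue.onnx")
except CalibrationCacheError as exc:
    caught = exc
```

The bare `raise` is what lets a caller still write `except CalibrationCacheError` even though the compiler itself only ever names `Exception`.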

## Files added / modified

### New (production)

- `src/gps_denied_onboard/components/c10_provisioning/engine_compiler.py`:
  `CompileOutcome` enum (`BUILT` / `REUSED`), `BackboneSpec` DTO,
  `EngineCompileRequest` DTO, `EngineCompileResult` DTO,
  `EngineCompileSummary` DTO, `CompileEngineCallable` structural
  Protocol, and the `EngineCompiler` class with the single public
  `compile_engines_for_corpus` method. Helpers:
  `_build_config_for_backbone` (synthesises one
  `OptimizationProfile` with
  `min == opt == max == expected_input_shape` from the backbone spec;
  richer dynamic-shape ranges are out of scope for AZ-321),
  `_summarise` (aggregate counts for the
  `c10.engine.compile.summary` log).
- `src/gps_denied_onboard/components/c10_provisioning/config.py`:
  `BackboneConfig` DTO (`model_name`, `onnx_path`,
  `expected_input_shape`, `input_name` with `"input"` default) +
  `C10ProvisioningConfig` (`backbones` tuple, `workspace_mb` default
  4096 to match C7 NFT-LIM-01). Both validate in `__post_init__`
  (non-empty strings, positive shape dims, duplicate model_name
  detection).
- `src/gps_denied_onboard/runtime_root/c10_factory.py`:
  `build_engine_compiler(config)` wires the existing
  `build_inference_runtime` factory through to a new `EngineCompiler`
  instance with a c10-scoped structured logger;
  `build_backbone_specs(config)` materialises the `BackboneSpec`
  tuple from
  `config.components['c10_provisioning'].backbones`.
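
The `__post_init__` checks listed for `config.py` can be illustrated with two small free functions. This is a sketch under assumptions: `validate_shape` and `first_duplicate` are hypothetical names, and `ValueError` stands in for the project's `ConfigError`.

```python
from typing import Iterable, Optional, Tuple


def validate_shape(shape: Tuple[int, ...]) -> None:
    """Reject empty shapes and any dim that is not a strictly positive int."""
    if not shape:
        raise ValueError("expected_input_shape must be a non-empty tuple")
    for dim in shape:
        # bool is a subclass of int, so True would otherwise pass as 1.
        if not isinstance(dim, int) or isinstance(dim, bool) or dim <= 0:
            raise ValueError(f"non-positive or non-int dim: {dim!r}")


def first_duplicate(names: Iterable[str]) -> Optional[str]:
    """Return the first model name seen twice, or None if all are unique."""
    seen = set()
    for name in names:
        if name in seen:
            return name
        seen.add(name)
    return None


validate_shape((1, 3, 224, 224))  # a valid static shape passes silently

try:
    validate_shape((1, True, 224))
    bool_rejected = False
except ValueError:
    bool_rejected = True

duplicate = first_duplicate(["dinov2_vpr", "lightglue", "dinov2_vpr"])
```

The explicit `bool` check matters for YAML-driven config, where a stray `true` would otherwise be accepted as a dimension of 1.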

### Modified (production)

- `src/gps_denied_onboard/components/c10_provisioning/__init__.py`:
  re-exports the AZ-321 public surface (`EngineCompiler`,
  `BackboneSpec`, `EngineCompileRequest`, `EngineCompileResult`,
  `CompileOutcome`, `EngineCompileSummary`, `CompileEngineCallable`,
  `BackboneConfig`, `C10ProvisioningConfig`) and registers the new
  config block via
  `register_component_block("c10_provisioning", C10ProvisioningConfig)`.
  `CacheProvisioner` / `Manifest` / `EngineCacheEntry` re-exports
  unchanged.

### New (tests)

- `tests/unit/c10_provisioning/test_engine_compiler.py`: **NEW**
  Tier-1 suite covering every AC + the 2 Tier-2 NFR placeholders:
  - **AC-1** cold cache + 3 backbones → all `BUILT`; 3 `.engine` +
    3 `.sha256` files on disk; 3 `c10.engine.cache.miss` WARN logs;
    1 `c10.engine.compile.summary` INFO log with `engines_built=3`.
  - **AC-2** warm cache + identical request → all `REUSED`;
    `compile_duration_s is None` for every result; ZERO calls to the
    fake runtime; 3 `c10.engine.cache.hit` INFO logs.
  - **AC-3** mixed (1 hit + 2 miss): DINOv2 reused, LightGlue +
    ALIKED built; 2 calls to the fake runtime.
  - **AC-4** hardware change (sm 87→89, jp 6.2→6.3, trt 10.3→10.5):
    every backbone rebuilt at the new filename; old files at the old
    filename untouched on disk.
  - **AC-5** tampered sidecar (overwrite LightGlue's `.sha256` with
    `0`×64): LightGlue rebuilt; DINOv2 + ALIKED still reused; 1
    `c10.engine.sidecar.mismatch` WARN log with
    `model_name=lightglue` and `reason=digest_mismatch`. Plus a
    sibling case where the sidecar file is deleted entirely
    (`Sha256Sidecar.verify` raises); same WARN-then-rebuild outcome.
  - **AC-6** `EngineBuildError` mid-corpus (backbone 2 of 3 fails):
    error propagates; backbone 1 (pre-cached, reused) untouched on
    disk; backbone 2's would-be engine NOT on disk (atomic-write
    guarantee from the fake mirrors AZ-298's real behaviour);
    backbone 3 never attempted (single call recorded for backbone 2).
  - **AC-7** `CalibrationCacheError` propagates with the
    `c10.engine.compile.error` ERROR log carrying `model_name`,
    `calibration_path`, `error_class=CalibrationCacheError`.
  - **AC-8** filename is exactly
    `dinov2_vpr__sm87_jp6.2_trt10.3_fp16.engine` (per AZ-281
    canonical schema with the `__` separator between model and
    `sm`); sidecar at `*.engine.sha256` with 64-hex digest;
    `EngineFilenameSchema.parse` round-trip + `Sha256Sidecar.verify`
    both pass.
  - **AC-9** `compile_duration_s` is a positive float for every
    `BUILT` result, `None` for every `REUSED` result.
  - **AC-10** empty `backbones` tuple → empty result; ZERO runtime
    calls; ZERO files written; 1 summary log with all-zero counts.
  - **NFR-perf-cache-hit** Tier-2 placeholder skip (200 MB engine
    sweep belongs in the AZ-321 microbench harness on Jetson).
  - **NFR-reliability-atomic-write** Tier-2 placeholder skip
    (kill-during-compile scenario lives in the microbench harness;
    the atomicity contract itself is owned by AZ-280's tests).

The Tier-1 tests use a `_FakeRuntime` that satisfies
`CompileEngineCallable` and writes deterministic engine bytes via
the REAL `Sha256Sidecar.write_atomic_and_sidecar`, so the cache-hit
/ cache-miss / tampered-sidecar paths run against the same helper
the production wiring uses. Only the C7-runtime-specific compile
internals (TRT engine bytes, calibration cache, GPU memory) are
mocked.
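
To make the tampered-sidecar and missing-sidecar cases concrete, here is a self-contained approximation of a write-then-verify helper. It is not the AZ-280 `Sha256Sidecar` implementation: the function names mirror the ones in this report, but the bodies, the `.tmp` suffix, and the use of `FileNotFoundError` for a missing sidecar are assumptions (the real helper raises its own error type, and this sketch does not write the sidecar atomically).

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sidecar_path(engine: Path) -> Path:
    """Sidecar lives next to the engine as `<name>.engine.sha256`."""
    return engine.with_name(engine.name + ".sha256")


def write_atomic_and_sidecar(engine: Path, payload: bytes) -> str:
    """Write the payload via temp file + rename, then record its digest."""
    tmp = engine.with_name(engine.name + ".tmp")
    tmp.write_bytes(payload)
    os.replace(tmp, engine)  # atomic when source and target share a filesystem
    digest = hashlib.sha256(payload).hexdigest()
    sidecar_path(engine).write_text(digest)
    return digest


def verify(engine: Path) -> bool:
    """True when the engine bytes hash to the digest stored in the sidecar."""
    expected = sidecar_path(engine).read_text().strip()
    return hashlib.sha256(engine.read_bytes()).hexdigest() == expected


with tempfile.TemporaryDirectory() as tmp:
    engine = Path(tmp) / "lightglue__sm87_jp6.2_trt10.3_fp16.engine"
    digest = write_atomic_and_sidecar(engine, b"deterministic-engine-bytes")
    ok_before = verify(engine)

    sidecar_path(engine).write_text("0" * 64)  # AC-5: tamper with the digest
    ok_after = verify(engine)

    sidecar_path(engine).unlink()  # sibling case: sidecar deleted entirely
    try:
        verify(engine)
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

In both failure cases the caller is expected to log a warning and rebuild rather than load the engine.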

### Modified (docs)

- `_docs/02_document/module-layout.md`: c10_provisioning
  Per-Component Mapping now lists the new internal modules
  (`engine_compiler.py`, `config.py`) and the composition-root
  `c10_factory.py`; the Public API re-export list is extended with
  the AZ-321 surface; the `Config block` line is added (registered
  on import). `default_provisioner.py` row marked `pending` until
  the AZ-325 task lands.

## Acceptance criteria coverage

| AC | Test | Status |
|----|------|--------|
| AC-1 Cold cache → all built | `test_ac1_cold_cache_compiles_every_backbone` | passing |
| AC-2 Warm cache → all reused, zero compile calls | `test_ac2_warm_cache_reuses_every_backbone` | passing |
| AC-3 Mixed cache | `test_ac3_mixed_cache_hits_and_misses` | passing |
| AC-4 Hardware change invalidates filename | `test_ac4_hardware_change_invalidates_cache` | passing |
| AC-5 Tampered sidecar + missing sidecar paths | `test_ac5_tampered_sidecar_invalidates_that_engine` + `test_missing_sidecar_treated_as_cache_miss` | passing |
| AC-6 `EngineBuildError` propagates, partial state consistent | `test_ac6_engine_build_error_propagates_and_third_backbone_untouched` | passing |
| AC-7 `CalibrationCacheError` propagates with diagnostic log | `test_ac7_calibration_cache_error_propagates` | passing |
| AC-8 Filename + sidecar layout matches AZ-281 schema | `test_ac8_filename_and_sidecar_layout` | passing |
| AC-9 `compile_duration_s` recorded for built only | `test_ac9_compile_duration_recorded_for_built_only` | passing |
| AC-10 Empty backbones → empty result, no side effects | `test_ac10_empty_backbones_returns_empty` | passing |
| NFR-perf-cache-hit p99 ≤ 1.5 s for 200 MB engine | `test_nfr_perf_cache_hit_p99_under_1500ms_for_200mb_engine` (Tier-2) | Tier-2 skipped |
| NFR-reliability atomic-write no half-engine after kill | `test_nfr_reliability_atomic_write_no_half_engine_after_kill` (Tier-2) | Tier-2 skipped |

## AC Test Coverage: 10 of 10 covered (+ 2 NFRs)
## Code Review Verdict: PASS_WITH_WARNINGS (4 Low accepted; see Findings)
## Auto-Fix Attempts: 0
## Stuck Agents: None

## Findings (self-review)

| # | Severity | Category | Location | Note | Resolution |
|---|----------|----------|----------|------|------------|
| 1 | Low | Architecture | `engine_compiler.py::_compile_one` | Catches the broad `Exception` (not the specific AZ-297 `RuntimeError` family) because the c10 layer cannot import `components.c7_inference` (architecture rule `test_az270_compose_root.test_ac6`). The C7 contract scopes its runtime exceptions to its own family; ANY exception bubbling out of `compile_engine` is treated as a compile failure here. Re-raise preserves the original type. Inline comment documents the rule. | Open (Low): accepted; architecture rule wins. |
| 2 | Low | Maintainability | `engine_compiler.py::CompileEngineCallable` | Duplicates the `compile_engine` method shape from the C7 `InferenceRuntime` Protocol. Mirrors the LightGlue dual-Protocol pattern already in `_types/manifests.py` (consumer-side structural cut vs. producer-side opaque marker). | Open (Low): accepted; matches established pattern. |
| 3 | Low | Architecture | `engine_compiler.py` ↔ C7InferenceConfig | `EngineCompileRequest.cache_root` MUST equal the directory the C7 runtime writes to (`C7InferenceConfig.engine_cache_dir`). The composition root (`build_engine_compiler` + the C10 corpus driver T5 in AZ-325) is responsible for keeping the two in sync; the compiler itself trusts the request. If the two diverge, every cache lookup misses. | Open (Low): flagged for AZ-325 to enforce. |
| 4 | Low | Scope | `engine_compiler.py::_build_config_for_backbone` | Synthesises exactly one `OptimizationProfile` with `min == opt == max == expected_input_shape`. Backbones requiring dynamic input ranges would need a richer `BackboneSpec` carrying explicit `OptimizationProfile` tuples. None of the AZ-321 corpus backbones (DINOv2-VPR, LightGlue, ALIKED) need dynamic shapes today, but the limitation is real. | Open (Low): accepted; future extension. |

## Tracker

- AZ-321 transitioned to **In Progress** at session start; will move
  to **In Testing** post-commit per `protocols.md`.

## Test suite

- `tests/unit/c10_provisioning/`: 13 passing, 2 Tier-2 skips
  (cache-hit p99 NFR + atomic-write kill scenario).
- Combined unit suite excluding pending components (c1, c2, c2.5,
  c3, c3.5, c4, c5, c8, c11, c12) and the c6 collection blocker on
  this host (missing `psycopg_pool` is a known dev-machine env issue,
  pre-existing): 543 passing, 21 environment-skipped, 1 warning
  (pre-existing `pynvml` FutureWarning unrelated to AZ-321).

## Next batch

Cycle 1 advances per the greenfield queue; autodev re-detects the
next AZ ticket in the Step 7 batch loop. AZ-321 unblocks AZ-322
(C10 Descriptor Batcher), AZ-337 (C2 UltraVPR), AZ-345 / AZ-346 /
AZ-347 (C3 matchers), and AZ-349 (C3.5 refiner) at the topological
level; the next ready batch is computed by `compute-next-batch`.

A cumulative review (batches 31–33) will fire at the next sub-skill
phase boundary per Step 14.5's K=3 trigger.

@@ -0,0 +1,54 @@
# Code Review Report: Batch 33 / Cycle 1

**Batch**: 33
**Tasks**: AZ-321 (C10 EngineCompiler)
**Date**: 2026-05-12
**Verdict**: PASS_WITH_WARNINGS

## Findings

| # | Severity | Category | File:Line | Title |
|---|----------|----------|-----------|-------|
| 1 | Low | Architecture | `engine_compiler.py::_compile_one` | Broad `except Exception` rather than the AZ-297 `RuntimeError` family |
| 2 | Low | Maintainability | `engine_compiler.py::CompileEngineCallable` | Duplicates the `compile_engine` signature from the C7 `InferenceRuntime` Protocol |
| 3 | Low | Architecture | `engine_compiler.py` ↔ C7InferenceConfig | `EngineCompileRequest.cache_root` MUST mirror `C7InferenceConfig.engine_cache_dir`; invariant enforced by the composition root, not the compiler |
| 4 | Low | Scope | `engine_compiler.py::_build_config_for_backbone` | Single static `OptimizationProfile` synthesised per backbone; dynamic shape ranges out of scope |

### Finding Details

**F1: Broad `except Exception` on `compile_engine`** (Low / Architecture)
- Location: `src/gps_denied_onboard/components/c10_provisioning/engine_compiler.py::EngineCompiler._compile_one`
- Description: The C7 contract scopes `InferenceRuntime.compile_engine` exceptions to the C7-local `RuntimeError` family (`EngineBuildError`, `CalibrationCacheError`, ...). The c10 layer is forbidden from importing `components.c7_inference` (architecture rule `test_az270_compose_root.test_ac6` walks all `components/*/*.py` files and flags any cross-component import, including TYPE_CHECKING-guarded ones). Catching the broader `Exception` and recording `type(exc).__name__` in the log payload is the cheapest fix that respects the rule. Re-raise preserves the original exception type for the caller.
- Suggestion: A longer-term cleanup would either (a) hoist the C7 error envelope to `_types/inference.py` (parallels the `EngineCacheEntry` move) or (b) extend the architecture-lint to allow Public-API-only imports from sibling components. Both are bigger scope than AZ-321.
- Task: AZ-321
- Resolution: Open (Low), accepted as documented; inline comment in source.

**F2: `CompileEngineCallable` shadow Protocol** (Low / Maintainability)
- Location: `engine_compiler.py::CompileEngineCallable`
- Description: Defines a structural Protocol carrying only the single `compile_engine` method shape: the c10 compiler's narrow consumer cut of the AZ-297 `InferenceRuntime` Protocol. Mirrors the LightGlue dual-Protocol pattern already documented in `_types/manifests.py` (`EngineHandle` consumer-cut Protocol vs `c7_inference.EngineHandle` opaque marker class).
- Suggestion: None; the same pattern is already accepted across the codebase. A future "Public-API cross-component import allowlist" lint update could collapse this dual.
- Task: AZ-321
- Resolution: Open (Low), accepted; matches existing pattern.

**F3: `cache_root` / `engine_cache_dir` invariant** (Low / Architecture)
- Location: `engine_compiler.py::EngineCompileRequest.cache_root` vs `C7InferenceConfig.engine_cache_dir`
- Description: The compiler's cache-hit detection writes nothing; it just checks `target_path.exists()` + `Sha256Sidecar.verify`. The C7 runtime owns the `.engine` write path and writes to `C7InferenceConfig.engine_cache_dir`. If the composition root passes a different `cache_root` to the compiler than the C7 runtime is configured for, cache hits will never fire (the compiler will look at the wrong directory). The compiler itself trusts the request; the wiring invariant lives in `build_engine_compiler` and (later) the AZ-325 `CacheProvisioner` driver.
- Suggestion: AZ-325 (C10 Cache Provisioner, the orchestrator T5 that drives the compiler) should pull both paths from the same config field or assert their equality at construction time. The compiler stays scope-bound to one model at a time.
- Task: AZ-321 (flag for AZ-325 follow-up)
- Resolution: Open (Low), accepted; flagged for AZ-325.
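
One way the construction-time equality check suggested in F3 might look. This is a sketch only: `check_cache_dirs` is a hypothetical helper, not part of AZ-321 or AZ-325, and comparing resolved paths is one possible definition of "equal".

```python
import tempfile
from pathlib import Path


def check_cache_dirs(cache_root: Path, engine_cache_dir: Path) -> Path:
    """Fail fast when the compiler and the runtime disagree on the cache directory."""
    resolved_root = cache_root.resolve()
    resolved_dir = engine_cache_dir.resolve()
    if resolved_root != resolved_dir:
        raise ValueError(
            f"cache_root {resolved_root} != engine_cache_dir {resolved_dir}; "
            "cache hits would never fire"
        )
    return resolved_root


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "engines"
    root.mkdir()

    # Different spellings of the same directory compare equal after resolve().
    same = check_cache_dirs(root, root / ".." / "engines")

    try:
        check_cache_dirs(root, Path(tmp) / "other")
        diverged = False
    except ValueError:
        diverged = True
```

Resolving both sides first avoids false alarms from symlinks or `..` segments in configured paths.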

**F4: Single static `OptimizationProfile` per backbone** (Low / Scope)
- Location: `engine_compiler.py::_build_config_for_backbone`
- Description: The synthesised `BuildConfig` carries exactly one `OptimizationProfile` with `min == opt == max == expected_input_shape`. Backbones requiring dynamic input ranges (variable batch size, variable image resolution) would need a richer `BackboneSpec` carrying explicit `OptimizationProfile` tuples. AZ-321's named corpus (DINOv2-VPR, LightGlue, ALIKED) uses fixed shapes; this is OK today but a real limitation.
- Suggestion: When the first dynamic-shape backbone arrives, extend `BackboneSpec` with a `dynamic_profiles: tuple[OptimizationProfile, ...]` field and prefer it over the synthetic single-profile fallback inside `_build_config_for_backbone`.
- Task: AZ-321
- Resolution: Open (Low), accepted; future extension.
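
The F4 suggestion could be sketched as follows. The dataclasses here are simplified stand-ins, not the real `_types.inference.OptimizationProfile` or the shipped `BackboneSpec`; the field names `min_shape` / `opt_shape` / `max_shape` and the `profiles_for` helper are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizationProfile:
    """Simplified stand-in; field names are assumed."""
    input_name: str
    min_shape: tuple[int, ...]
    opt_shape: tuple[int, ...]
    max_shape: tuple[int, ...]


@dataclass(frozen=True)
class BackboneSpec:
    """Stand-in spec with the proposed optional `dynamic_profiles` field."""
    model_name: str
    input_name: str
    expected_input_shape: tuple[int, ...]
    dynamic_profiles: tuple[OptimizationProfile, ...] = ()


def profiles_for(spec: BackboneSpec) -> tuple[OptimizationProfile, ...]:
    """Prefer explicit dynamic profiles; otherwise fall back to one static profile."""
    if spec.dynamic_profiles:
        return spec.dynamic_profiles
    shape = spec.expected_input_shape
    return (OptimizationProfile(spec.input_name, shape, shape, shape),)


static_spec = BackboneSpec("dinov2_vpr", "input", (1, 3, 224, 224))
static_profiles = profiles_for(static_spec)

dynamic = OptimizationProfile("input", (1, 3, 224, 224), (4, 3, 224, 224), (8, 3, 224, 224))
dynamic_spec = BackboneSpec("future_model", "input", (4, 3, 224, 224), (dynamic,))
dynamic_profiles = profiles_for(dynamic_spec)
```

Defaulting the new field to an empty tuple keeps every existing fixed-shape spec valid without changes.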

## Verdict Logic

- 0 Critical
- 0 High
- 0 Medium
- 4 Low

→ **PASS_WITH_WARNINGS**: only Low findings; all accepted as documented.

@@ -6,10 +6,10 @@ step: 7
 name: Implement
 status: in_progress
 sub_step:
-  phase: 3
-  name: compute-next-batch
+  phase: 14
+  name: loop-to-next-batch
   detail: ""
 retry_count: 0
 cycle: 1
 tracker: jira
-last_completed_batch: 32
+last_completed_batch: 33

@@ -3,10 +3,45 @@
 ``EngineCacheEntry`` is the C7 canonical DTO (frozen at AZ-297) and
 lives at the L1 ``_types`` layer so C10 can re-export it without
 crossing the components.* boundary (architecture rule AC-6).
+
+The AZ-321 ``EngineCompiler`` plus its DTOs are re-exported here so
+the composition root and downstream operator-tooling code consume
+them through this single contract surface.
 """
 
 from gps_denied_onboard._types.inference import EngineCacheEntry
 from gps_denied_onboard._types.manifests import Manifest
-from gps_denied_onboard.components.c10_provisioning.interface import CacheProvisioner
+from gps_denied_onboard.components.c10_provisioning.config import (
+    BackboneConfig,
+    C10ProvisioningConfig,
+)
+from gps_denied_onboard.components.c10_provisioning.engine_compiler import (
+    BackboneSpec,
+    CompileEngineCallable,
+    CompileOutcome,
+    EngineCompileRequest,
+    EngineCompileResult,
+    EngineCompileSummary,
+    EngineCompiler,
+)
+from gps_denied_onboard.components.c10_provisioning.interface import (
+    CacheProvisioner,
+)
+from gps_denied_onboard.config.schema import register_component_block
 
-__all__ = ["CacheProvisioner", "EngineCacheEntry", "Manifest"]
+register_component_block("c10_provisioning", C10ProvisioningConfig)
+
+__all__ = [
+    "BackboneConfig",
+    "BackboneSpec",
+    "C10ProvisioningConfig",
+    "CacheProvisioner",
+    "CompileEngineCallable",
+    "CompileOutcome",
+    "EngineCacheEntry",
+    "EngineCompileRequest",
+    "EngineCompileResult",
+    "EngineCompileSummary",
+    "EngineCompiler",
+    "Manifest",
+]

@@ -0,0 +1,109 @@
"""C10 cache-provisioning config block (AZ-321).

Registered into ``config.components['c10_provisioning']`` by the
package ``__init__.py``. The composition-root factory
:func:`gps_denied_onboard.runtime_root.c10_factory.build_engine_compiler`
reads this block to enumerate the project's backbones and to bound
the workspace memory passed to
:meth:`InferenceRuntime.compile_engine`.

Backbone enumeration is config-driven (not hardcoded) so a new model
is a YAML change rather than a code change; see the AZ-321 task
spec §Constraints.
"""

from __future__ import annotations

from dataclasses import dataclass, field

from gps_denied_onboard.config.schema import ConfigError

__all__ = [
    "BackboneConfig",
    "C10ProvisioningConfig",
]


_DEFAULT_WORKSPACE_MB: int = 4096


@dataclass(frozen=True)
class BackboneConfig:
    """One backbone the C10 corpus needs an engine for.

    ``onnx_path`` is the absolute path to the source ``.onnx`` file;
    the path is resolved by the composition root, not by this
    dataclass, so we keep it as a string here for cheap YAML
    round-trip.

    ``expected_input_shape`` is parsed into a
    :class:`gps_denied_onboard.components.c10_provisioning.engine_compiler.BackboneSpec`
    at factory time; this dataclass keeps it as a tuple so instances
    stay hashable (a frozen dataclass hashes its fields).
    """

    model_name: str
    onnx_path: str
    expected_input_shape: tuple[int, ...]
    input_name: str = "input"

    def __post_init__(self) -> None:
        if not self.model_name:
            raise ConfigError(
                "BackboneConfig.model_name must be a non-empty string"
            )
        if not self.onnx_path:
            raise ConfigError(
                f"BackboneConfig({self.model_name!r}).onnx_path must "
                "be a non-empty string"
            )
        if not self.expected_input_shape:
            raise ConfigError(
                f"BackboneConfig({self.model_name!r}).expected_input_shape "
                "must be a non-empty tuple of positive ints"
            )
        for dim in self.expected_input_shape:
            # bool is a subclass of int, so it is rejected explicitly.
            if not isinstance(dim, int) or isinstance(dim, bool) or dim <= 0:
                raise ConfigError(
                    f"BackboneConfig({self.model_name!r}).expected_input_shape "
                    f"contains non-positive or non-int dim: {dim!r}"
                )
        if not self.input_name:
            raise ConfigError(
                f"BackboneConfig({self.model_name!r}).input_name must "
                "be a non-empty string"
            )


@dataclass(frozen=True)
class C10ProvisioningConfig:
    """Per-component config for C10 cache provisioning.

    ``backbones`` enumerates the project's engine corpus; default is
    empty so a unit test or replay run that has no use for engines
    can leave this unconfigured. Production deployments populate it
    via YAML.

    ``workspace_mb`` is the per-engine workspace allocation passed
    into :class:`BuildConfig`; defaults to 4 GiB which matches the
    C7 NFT-LIM-01 GPU memory budget. Operators can dial it down for
    Tier-2 compile workstations with less GPU memory.
    """

    backbones: tuple[BackboneConfig, ...] = field(default_factory=tuple)
    workspace_mb: int = _DEFAULT_WORKSPACE_MB

    def __post_init__(self) -> None:
        if self.workspace_mb <= 0:
            raise ConfigError(
                "C10ProvisioningConfig.workspace_mb must be > 0; "
                f"got {self.workspace_mb}"
            )
        seen: set[str] = set()
        for backbone in self.backbones:
            if backbone.model_name in seen:
                raise ConfigError(
                    "C10ProvisioningConfig.backbones contains duplicate "
                    f"model_name {backbone.model_name!r}"
                )
            seen.add(backbone.model_name)

@@ -0,0 +1,407 @@
"""C10 ``EngineCompiler``: per-model TRT compile + hardware-tied cache reuse (AZ-321).

Public surface frozen by `_docs/02_document/components/11_c10_provisioning/description.md`
§5 (error handling) + §7 (D-C10-6 calibration-cache reuse, D-C10-7 self-describing
filename).

Responsibilities
----------------

For every :class:`BackboneSpec` in :class:`EngineCompileRequest` the
compiler:

1. Computes the canonical engine filename via AZ-281
   :class:`EngineFilenameSchema` from the host's
   :class:`HostCapabilities` plus the request precision.
2. If the engine is already on disk at
   ``{cache_root}/{filename}`` AND
   :meth:`Sha256Sidecar.verify` returns ``True`` for that path:
   treats it as a cache hit (``CompileOutcome.REUSED``) and returns a
   canonical :class:`EngineCacheEntry` synthesised from the sidecar.
   Zero calls to the injected :class:`InferenceRuntime`.
3. Otherwise delegates to
   :meth:`InferenceRuntime.compile_engine` (AZ-298 / AZ-299 / AZ-300
   own the write path; the runtime atomically writes both the
   ``.engine`` binary and its ``.sha256`` sidecar). The compiler does
   NOT double-write the file: the task spec's "engine bytes are
   returned by compile_engine then written via the sidecar" wording
   contradicts the actual AZ-297 Protocol (``compile_engine`` returns
   an :class:`EngineCacheEntry`, not raw bytes); the Protocol shipped
   first and wins.

Hardware-tied cache reuse (D-C10-6) is satisfied by the filename
construction: an engine compiled on ``(sm=87, jp=6.2, trt=10.3, fp16)``
lives at a different path than one compiled on
``(sm=89, jp=6.3, trt=10.5, fp16)``, so a hardware change naturally
forces a rebuild; the compiler neither loads nor deletes stale
engines (AC-4).
"""

from __future__ import annotations

import logging
import time
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Protocol, runtime_checkable

from gps_denied_onboard._types.inference import (
    BuildConfig,
    EngineCacheEntry,
    OptimizationProfile,
    PrecisionMode,
)
from gps_denied_onboard._types.manifests import HostCapabilities
from gps_denied_onboard.helpers.engine_filename_schema import (
    EngineFilenameSchema,
)
from gps_denied_onboard.helpers.sha256_sidecar import (
    Sha256Sidecar,
    Sha256SidecarError,
)

__all__ = [
    "BackboneSpec",
    "CompileEngineCallable",
    "CompileOutcome",
    "EngineCompileRequest",
    "EngineCompileResult",
    "EngineCompileSummary",
    "EngineCompiler",
]


_DEFAULT_WORKSPACE_MB: int = 4096


@runtime_checkable
class CompileEngineCallable(Protocol):
    """Structural cut of the C7 ``InferenceRuntime`` Protocol (AZ-297).

    The compiler only ever calls
    :meth:`InferenceRuntime.compile_engine`, so it accepts any object
    that structurally satisfies this narrow Protocol. This keeps the
    c10 component free of cross-component imports (architecture rule
    ``test_az270_compose_root.test_ac6``) while still letting the real
    :class:`gps_denied_onboard.components.c7_inference.InferenceRuntime`
    plug in unchanged via duck typing — the composition root wires the
    concrete strategy in. Same dual-Protocol pattern used by the
    LightGlue ``EngineHandle`` consumer cut in ``_types/manifests.py``.
    """

    def compile_engine(
        self, model_path: Path, build_config: BuildConfig
    ) -> EngineCacheEntry: ...


class CompileOutcome(str, Enum):
    """Per-backbone outcome of one ``compile_engines_for_corpus`` call."""

    BUILT = "built"
    REUSED = "reused"


@dataclass(frozen=True)
class BackboneSpec:
    """One model the corpus needs an engine for.

    ``input_name`` defaults to ``"input"`` because most exported ONNX
    graphs in this project use that name; backbones with a different
    input name must override it. ``expected_input_shape`` is used to
    synthesise a single :class:`OptimizationProfile` with
    ``min == opt == max``; backbones that need explicit dynamic ranges
    should be split into separate :class:`OptimizationProfile`-aware
    helpers and supplied via ``custom_profiles`` (out of scope for the
    AZ-321 corpus; reserved for a later extension).
    """

    model_name: str
    onnx_path: Path
    expected_input_shape: tuple[int, ...]
    input_name: str = "input"


@dataclass(frozen=True)
class EngineCompileRequest:
    """Inputs to one ``compile_engines_for_corpus`` invocation.

    ``host`` is passed in (rather than introspected via the runtime)
    because the AZ-297 :class:`InferenceRuntime` Protocol does not
    expose host-info; the composition root resolves
    :class:`HostCapabilities` from device probes (Tier-2) or test
    fixtures (Tier-1) and threads it through here. This keeps the
    compiler decoupled from the runtime's introspection surface and
    makes the AC-4 (hardware change) test trivial.
    """

    backbones: tuple[BackboneSpec, ...]
    calibration_path: Path | None
    cache_root: Path
    precision: PrecisionMode
    host: HostCapabilities
    workspace_mb: int = _DEFAULT_WORKSPACE_MB


@dataclass(frozen=True)
class EngineCompileResult:
    """One backbone's outcome record after ``compile_engines_for_corpus``.

    ``entry`` is the canonical
    :class:`gps_denied_onboard._types.inference.EngineCacheEntry` —
    same shape whether the engine was freshly built or reused. The
    surrounding ``outcome`` + ``compile_duration_s`` are c10-local
    bookkeeping (the AZ-321 task spec called this combined record
    ``EngineCacheEntry`` but that name is already taken by the AZ-297
    canonical DTO; the canonical shape wins and the wrapper takes a
    new name).
    """

    entry: EngineCacheEntry
    outcome: CompileOutcome
    compile_duration_s: float | None


@dataclass(frozen=True)
class EngineCompileSummary:
    """Aggregate counts surfaced via the ``c10.engine.compile.summary`` log."""

    engines_built: int
    engines_reused: int
    cache_hit_ratio: float


class EngineCompiler:
    """Compile or reuse TensorRT engines for every backbone in a corpus.

    The compiler is stateless across calls; ``__init__`` only injects
    the collaborators it cannot construct itself
    (the :class:`InferenceRuntime` is composition-root-owned; the
    logger is named per component).
    """

    def __init__(
        self,
        *,
        inference_runtime: CompileEngineCallable,
        logger: logging.Logger,
    ) -> None:
        self._runtime = inference_runtime
        self._log = logger

    def compile_engines_for_corpus(
        self, request: EngineCompileRequest
    ) -> tuple[EngineCompileResult, ...]:
        """Compile or reuse one engine per backbone in ``request.backbones``.

        Empty ``backbones`` → empty result and a summary log with
        all-zero counts (AC-10). Errors from
        :meth:`InferenceRuntime.compile_engine` are logged as
        ``c10.engine.compile.error`` and re-raised unchanged, so they
        propagate to the caller (AC-6 / AC-7). Side effects on
        backbones processed before the failing one stay visible on
        disk; the compiler does NOT roll back (AZ-298's atomic write
        guarantees no half-engine).
        """

        engines_dir = request.cache_root
        engines_dir.mkdir(parents=True, exist_ok=True)
        results: list[EngineCompileResult] = []
        for backbone in request.backbones:
            result = self._compile_one(backbone, request)
            results.append(result)

        summary = _summarise(results)
        self._log.info(
            "c10.engine.compile.summary",
            extra={
                "kind": "c10.engine.compile.summary",
                "kv": {
                    "engines_built": summary.engines_built,
                    "engines_reused": summary.engines_reused,
                    "cache_hit_ratio": summary.cache_hit_ratio,
                    "total": len(results),
                },
            },
        )
        return tuple(results)

    def _compile_one(
        self,
        backbone: BackboneSpec,
        request: EngineCompileRequest,
    ) -> EngineCompileResult:
        filename = EngineFilenameSchema.build(
            model_name=backbone.model_name,
            sm=request.host.sm,
            jetpack=request.host.jetpack,
            trt=request.host.trt,
            precision=request.precision.value,
        )
        target_path = request.cache_root / filename

        cache_hit_entry = self._maybe_reuse(
            target_path, backbone, request
        )
        if cache_hit_entry is not None:
            self._log.info(
                "c10.engine.cache.hit",
                extra={
                    "kind": "c10.engine.cache.hit",
                    "kv": {
                        "model_name": backbone.model_name,
                        "engine_path": str(target_path),
                    },
                },
            )
            return EngineCompileResult(
                entry=cache_hit_entry,
                outcome=CompileOutcome.REUSED,
                compile_duration_s=None,
            )

        self._log.warning(
            "c10.engine.cache.miss",
            extra={
                "kind": "c10.engine.cache.miss",
                "kv": {
                    "model_name": backbone.model_name,
                    "target_filename": filename,
                },
            },
        )
        build_config = _build_config_for_backbone(backbone, request)
        t0 = time.perf_counter()
        try:
            entry = self._runtime.compile_engine(
                backbone.onnx_path, build_config
            )
        except Exception as exc:
            # The C7 InferenceRuntime contract scopes exceptions to its
            # `RuntimeError` family (`EngineBuildError`,
            # `CalibrationCacheError`, ...). The c10 layer is forbidden
            # from importing the c7 errors module (architecture rule
            # AC-6 / test_az270_compose_root.test_ac6), so we catch the
            # broader `Exception` and record the class name in the log
            # payload. The bare `raise` preserves the original type.
            self._log.error(
                "c10.engine.compile.error",
                extra={
                    "kind": "c10.engine.compile.error",
                    "kv": {
                        "model_name": backbone.model_name,
                        "calibration_path": (
                            str(request.calibration_path)
                            if request.calibration_path is not None
                            else None
                        ),
                        "error_class": type(exc).__name__,
                        "message": str(exc),
                    },
                },
            )
            raise
        elapsed_s = time.perf_counter() - t0
        return EngineCompileResult(
            entry=entry,
            outcome=CompileOutcome.BUILT,
            compile_duration_s=elapsed_s,
        )

    def _maybe_reuse(
        self,
        target_path: Path,
        backbone: BackboneSpec,
        request: EngineCompileRequest,
    ) -> EngineCacheEntry | None:
        """Return a synthesised :class:`EngineCacheEntry` on cache hit; ``None`` on miss.

        Side effect: emits a WARN log on a tampered / missing sidecar
        (the engine file exists but its sidecar is invalid). The
        recompile-on-miss branch is owned by the caller.
        """

        if not target_path.exists():
            return None
        try:
            verified = Sha256Sidecar.verify(target_path)
        except Sha256SidecarError as exc:
            self._log.warning(
                "c10.engine.sidecar.mismatch",
                extra={
                    "kind": "c10.engine.sidecar.mismatch",
                    "kv": {
                        "model_name": backbone.model_name,
                        "engine_path": str(target_path),
                        "reason": str(exc),
                    },
                },
            )
            return None
        if not verified:
            self._log.warning(
                "c10.engine.sidecar.mismatch",
                extra={
                    "kind": "c10.engine.sidecar.mismatch",
                    "kv": {
                        "model_name": backbone.model_name,
                        "engine_path": str(target_path),
                        "reason": "digest_mismatch",
                    },
                },
            )
            return None
        sidecar_text = (
            Path(str(target_path) + ".sha256").read_text().strip()
        )
        return EngineCacheEntry(
            engine_path=target_path,
            sha256_hex=sidecar_text,
            sm=request.host.sm,
            jp=request.host.jetpack,
            trt=request.host.trt,
            precision=request.precision,
            extras={},
        )


def _build_config_for_backbone(
    backbone: BackboneSpec, request: EngineCompileRequest
) -> BuildConfig:
    """Synthesise a :class:`BuildConfig` from a :class:`BackboneSpec`.

    Constructs exactly one :class:`OptimizationProfile` with
    ``min == opt == max == expected_input_shape``; backbones with
    dynamic input ranges are out of scope for AZ-321 and would need
    a richer ``BackboneSpec`` variant.
    """

    profile = OptimizationProfile(
        input_name=backbone.input_name,
        min_shape=backbone.expected_input_shape,
        opt_shape=backbone.expected_input_shape,
        max_shape=backbone.expected_input_shape,
    )
    return BuildConfig(
        precision=request.precision,
        workspace_mb=request.workspace_mb,
        calibration_dataset=request.calibration_path,
        optimization_profiles=(profile,),
    )


def _summarise(
    results: list[EngineCompileResult],
) -> EngineCompileSummary:
    built = sum(
        1 for r in results if r.outcome is CompileOutcome.BUILT
    )
    reused = sum(
        1 for r in results if r.outcome is CompileOutcome.REUSED
    )
    total = len(results)
    ratio = reused / total if total > 0 else 0.0
    return EngineCompileSummary(
        engines_built=built,
        engines_reused=reused,
        cache_hit_ratio=ratio,
    )
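The hardware-tied reuse rule in the module above can be illustrated without TensorRT. The sketch below uses only the standard library and rests on stated assumptions: `build_filename` is a hypothetical stand-in for `EngineFilenameSchema.build` that follows the `dinov2_vpr__sm87_jp6.2_trt10.3_fp16.engine` pattern asserted in the tests, the sidecar is assumed to hold just the hex digest, and `write_with_sidecar` / `is_cache_hit` are illustrative helpers, not the `Sha256Sidecar` API (the write here is not atomic).

```python
import hashlib
import tempfile
from pathlib import Path


def build_filename(model: str, sm: int, jp: str, trt: str, precision: str) -> str:
    """Hypothetical stand-in for EngineFilenameSchema.build."""
    return f"{model}__sm{sm}_jp{jp}_trt{trt}_{precision}.engine"


def write_with_sidecar(path: Path, payload: bytes) -> str:
    """Write engine bytes plus a hex-digest sidecar (not atomic in this sketch)."""
    digest = hashlib.sha256(payload).hexdigest()
    path.write_bytes(payload)
    Path(str(path) + ".sha256").write_text(digest)
    return digest


def is_cache_hit(path: Path) -> bool:
    """A hit requires the engine file AND a sidecar whose digest matches it."""
    sidecar = Path(str(path) + ".sha256")
    if not path.exists() or not sidecar.exists():
        return False
    expected = sidecar.read_text().strip()
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected


def demo() -> dict[str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        old = root / build_filename("dinov2_vpr", 87, "6.2", "10.3", "fp16")
        new = root / build_filename("dinov2_vpr", 89, "6.3", "10.5", "fp16")
        write_with_sidecar(old, b"engine-bytes")
        same_host = is_cache_hit(old)  # same host tuple: reuse
        new_host = is_cache_hit(new)   # new host tuple: different path, miss
        # Overwrite the sidecar to simulate tampering.
        Path(str(old) + ".sha256").write_text("0" * 64)
        tampered = is_cache_hit(old)
        return {
            "same_host": same_host,
            "new_host": new_host,
            "tampered": tampered,
            "old_kept": old.exists(),
        }
```

Because the host tuple is part of the filename, a hardware change needs no invalidation logic: the lookup simply misses at the new path, and the old file is left in place.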
@@ -0,0 +1,85 @@
"""C10 cache-provisioning factory (AZ-321).

Composition-root wiring for the AZ-321 :class:`EngineCompiler`. Reads
``config.components['c10_provisioning']`` for the backbone corpus,
resolves the :class:`InferenceRuntime` strategy via
:func:`gps_denied_onboard.runtime_root.inference_factory.build_inference_runtime`,
and returns a ready-to-call :class:`EngineCompiler`.

Backbone resolution is config-driven: the YAML enumerates the
project's engine corpus (initially DINOv2-VPR + LightGlue + ALIKED
per the AZ-321 task spec); adding a model is a config change rather
than a code change.
"""

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING

from gps_denied_onboard.components.c10_provisioning import (
    BackboneSpec,
    EngineCompiler,
)
from gps_denied_onboard.components.c10_provisioning.config import (
    BackboneConfig,
    C10ProvisioningConfig,
)
from gps_denied_onboard.logging import get_logger
from gps_denied_onboard.runtime_root.inference_factory import (
    build_inference_runtime,
)

if TYPE_CHECKING:
    from gps_denied_onboard.config.schema import Config

__all__ = [
    "build_backbone_specs",
    "build_engine_compiler",
]


def build_engine_compiler(config: "Config") -> EngineCompiler:
    """Construct a wired :class:`EngineCompiler` from ``config``.

    The factory:

    1. Resolves the :class:`InferenceRuntime` via the existing
       C7 factory (honouring the ``BUILD_*`` gating and the runtime
       selection in ``config.components['c7_inference']``).
    2. Names a c10-scoped structured logger.
    3. Hands both to :class:`EngineCompiler`.

    The :class:`BackboneSpec` corpus is NOT materialised by this
    factory; call :func:`build_backbone_specs` separately so the
    operator binary can pick up the spec list after Step 7 of the
    autodev flow without dragging an :class:`InferenceRuntime` along.
    """

    runtime = build_inference_runtime(config)
    logger = get_logger("c10_provisioning")
    return EngineCompiler(inference_runtime=runtime, logger=logger)


def build_backbone_specs(config: "Config") -> tuple[BackboneSpec, ...]:
    """Materialise the :class:`BackboneSpec` tuple from
    ``config.components['c10_provisioning'].backbones``.

    Wraps each :class:`BackboneConfig` ``onnx_path`` string in a
    :class:`Path` as given; the path is not made absolute here
    (validation happened at load time via
    :meth:`BackboneConfig.__post_init__`).
    """

    block: C10ProvisioningConfig = config.components["c10_provisioning"]
    return tuple(_backbone_spec_from_config(bb) for bb in block.backbones)


def _backbone_spec_from_config(
    backbone: BackboneConfig,
) -> BackboneSpec:
    return BackboneSpec(
        model_name=backbone.model_name,
        onnx_path=Path(backbone.onnx_path),
        expected_input_shape=tuple(backbone.expected_input_shape),
        input_name=backbone.input_name,
    )
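The wiring above depends on the injected runtime satisfying `CompileEngineCallable` structurally rather than by inheritance. A minimal sketch of that idea, using only `typing.Protocol` from the standard library, is shown here; `FakeRuntime`, `NotARuntime`, and `compile_all` are hypothetical names, and the `object` parameter and return types replace the project's `BuildConfig` and `EngineCacheEntry` to keep the example self-contained.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class CompileEngineCallable(Protocol):
    """Narrow cut: the only method the consumer ever calls."""

    def compile_engine(self, model_path: Path, build_config: object) -> object: ...


class FakeRuntime:
    """Hypothetical runtime; it does not inherit from the Protocol."""

    def __init__(self) -> None:
        self.calls: list[Path] = []

    def compile_engine(self, model_path: Path, build_config: object) -> object:
        self.calls.append(model_path)
        return {"engine_for": model_path.stem}


class NotARuntime:
    """Lacks compile_engine, so it does not satisfy the Protocol."""

    def infer(self) -> None: ...


def compile_all(runtime: CompileEngineCallable, models: list[Path]) -> list[object]:
    # The consumer depends only on the Protocol, never on a concrete class.
    return [runtime.compile_engine(model, None) for model in models]
```

Note that `isinstance` against a `runtime_checkable` Protocol only checks that the method exists, not its signature, so it serves as a coarse guard; static type checkers do the signature matching.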
@@ -0,0 +1,619 @@
"""Unit tests for AZ-321 :class:`EngineCompiler`.

Covers the 10 ACs + 2 NFRs in the AZ-321 task spec. Tier-1 tests use
a fake :class:`InferenceRuntime` that writes scripted bytes via the
real :class:`Sha256Sidecar` so the cache-hit / cache-miss /
tampered-sidecar paths exercise the production helpers. NFR perf +
atomic-write skips are Tier-2 placeholders kept for the microbench
harness.
"""

from __future__ import annotations

import logging
from dataclasses import dataclass, field
from pathlib import Path

import pytest

from gps_denied_onboard._types.inference import (
    BuildConfig,
    EngineCacheEntry,
    PrecisionMode,
)
from gps_denied_onboard._types.manifests import HostCapabilities
from gps_denied_onboard.components.c10_provisioning import (
    BackboneSpec,
    CompileOutcome,
    EngineCompileRequest,
    EngineCompiler,
)
from gps_denied_onboard.components.c7_inference import (
    CalibrationCacheError,
    EngineBuildError,
)
from gps_denied_onboard.helpers.engine_filename_schema import (
    EngineFilenameSchema,
)
from gps_denied_onboard.helpers.sha256_sidecar import Sha256Sidecar


# ----------------------------------------------------------------------
# Fixtures
# ----------------------------------------------------------------------


_HOST_T2: HostCapabilities = HostCapabilities(sm=87, jetpack="6.2", trt="10.3")
_HOST_T2_NEXT: HostCapabilities = HostCapabilities(
    sm=89, jetpack="6.3", trt="10.5"
)


@dataclass
class _FakeRuntime:
    """Stand-in for a real C7 ``InferenceRuntime`` in Tier-1 tests.

    ``compile_engine`` writes deterministic engine bytes (a tiny
    payload derived from the model name) via the real
    :class:`Sha256Sidecar` to the same path the C7 production runtimes
    would. The compiler under test consumes the returned
    :class:`EngineCacheEntry` exactly as it would from
    :class:`TensorrtRuntime`.

    Behaviour knobs:

    - ``raise_on``: maps ``model_name`` → exception instance the fake
      raises instead of writing the file. Used by AC-6 / AC-7 to
      simulate a failure mid-corpus.
    - ``calls``: records each ``compile_engine`` call so the cache-hit
      AC can assert zero invocations.
    """

    cache_root: Path
    host: HostCapabilities = _HOST_T2
    raise_on: dict[str, Exception] = field(default_factory=dict)
    calls: list[tuple[Path, BuildConfig]] = field(default_factory=list)

    def compile_engine(
        self, model_path: Path, build_config: BuildConfig
    ) -> EngineCacheEntry:
        self.calls.append((model_path, build_config))
        model_name = Path(model_path).stem
        exc = self.raise_on.get(model_name)
        if exc is not None:
            raise exc

        filename = EngineFilenameSchema.build(
            model_name=model_name,
            sm=self.host.sm,
            jetpack=self.host.jetpack,
            trt=self.host.trt,
            precision=build_config.precision.value,
        )
        target_path = self.cache_root / filename
        target_path.parent.mkdir(parents=True, exist_ok=True)
        payload = (
            f"FAKE-ENGINE:{model_name}:{build_config.precision.value}"
        ).encode("utf-8")
        sha_hex = Sha256Sidecar.write_atomic_and_sidecar(target_path, payload)
        return EngineCacheEntry(
            engine_path=target_path,
            sha256_hex=sha_hex,
            sm=self.host.sm,
            jp=self.host.jetpack,
            trt=self.host.trt,
            precision=build_config.precision,
            extras={"fake": "true"},
        )


@pytest.fixture
def cache_root(tmp_path: Path) -> Path:
    root = tmp_path / "engines"
    root.mkdir(parents=True, exist_ok=True)
    return root


@pytest.fixture
def backbones(tmp_path: Path) -> tuple[BackboneSpec, ...]:
    onnx_dir = tmp_path / "onnx"
    onnx_dir.mkdir(parents=True, exist_ok=True)
    specs: list[BackboneSpec] = []
    for model_name in ("dinov2_vpr", "lightglue", "aliked"):
        onnx_path = onnx_dir / f"{model_name}.onnx"
        onnx_path.write_bytes(b"ONNX:" + model_name.encode("ascii"))
        specs.append(
            BackboneSpec(
                model_name=model_name,
                onnx_path=onnx_path,
                expected_input_shape=(1, 3, 224, 224),
            )
        )
    return tuple(specs)


@pytest.fixture
def logger() -> logging.Logger:
    return logging.getLogger("test.c10_provisioning")


def _request(
    backbones: tuple[BackboneSpec, ...],
    cache_root: Path,
    host: HostCapabilities = _HOST_T2,
    precision: PrecisionMode = PrecisionMode.FP16,
    calibration_path: Path | None = None,
) -> EngineCompileRequest:
    return EngineCompileRequest(
        backbones=backbones,
        calibration_path=calibration_path,
        cache_root=cache_root,
        precision=precision,
        host=host,
    )


def _populate_cache(
    backbones: tuple[BackboneSpec, ...],
    cache_root: Path,
    host: HostCapabilities = _HOST_T2,
    precision: PrecisionMode = PrecisionMode.FP16,
) -> dict[str, Path]:
    """Pre-write engine + sidecar for every backbone; return name→path map."""

    cache_root.mkdir(parents=True, exist_ok=True)
    paths: dict[str, Path] = {}
    for spec in backbones:
        filename = EngineFilenameSchema.build(
            model_name=spec.model_name,
            sm=host.sm,
            jetpack=host.jetpack,
            trt=host.trt,
            precision=precision.value,
        )
        target_path = cache_root / filename
        payload = (
            f"PRE-WRITTEN:{spec.model_name}:{precision.value}"
        ).encode("utf-8")
        Sha256Sidecar.write_atomic_and_sidecar(target_path, payload)
        paths[spec.model_name] = target_path
    return paths


# ----------------------------------------------------------------------
# AC-1: cold cache compiles every backbone
# ----------------------------------------------------------------------


def test_ac1_cold_cache_compiles_every_backbone(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
) -> None:
    # Arrange
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    with caplog.at_level(logging.DEBUG, logger=logger.name):
        results = compiler.compile_engines_for_corpus(request)

    # Assert
    assert len(results) == 3
    for r in results:
        assert r.outcome is CompileOutcome.BUILT
        assert r.compile_duration_s is not None
        assert r.compile_duration_s >= 0.0
        assert r.entry.engine_path.exists()
        sidecar = Path(str(r.entry.engine_path) + ".sha256")
        assert sidecar.exists()
        assert Sha256Sidecar.verify(r.entry.engine_path) is True
    assert len(runtime.calls) == 3

    miss_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.cache.miss"
    ]
    summary_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.compile.summary"
    ]
    assert len(miss_kinds) == 3
    assert len(summary_kinds) == 1
    assert summary_kinds[0].__dict__["kv"]["engines_built"] == 3
    assert summary_kinds[0].__dict__["kv"]["engines_reused"] == 0


# ----------------------------------------------------------------------
# AC-2: warm cache reuses every backbone
# ----------------------------------------------------------------------


def test_ac2_warm_cache_reuses_every_backbone(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
) -> None:
    # Arrange
    _populate_cache(backbones, cache_root)
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    with caplog.at_level(logging.DEBUG, logger=logger.name):
        results = compiler.compile_engines_for_corpus(request)

    # Assert
    assert len(results) == 3
    for r in results:
        assert r.outcome is CompileOutcome.REUSED
        assert r.compile_duration_s is None
    assert runtime.calls == []
    hit_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.cache.hit"
    ]
    summary = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.compile.summary"
    ]
    assert len(hit_kinds) == 3
    assert len(summary) == 1
    assert summary[0].__dict__["kv"]["engines_reused"] == 3


# ----------------------------------------------------------------------
# AC-3: mixed cache (1 hit + 2 miss)
# ----------------------------------------------------------------------


def test_ac3_mixed_cache_hits_and_misses(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
) -> None:
    # Arrange
    only_dinov2 = (backbones[0],)
    _populate_cache(only_dinov2, cache_root)
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    results = compiler.compile_engines_for_corpus(request)

    # Assert
    outcomes = {r.entry.engine_path.name: r.outcome for r in results}
    dinov2_outcomes = [
        v for k, v in outcomes.items() if k.startswith("dinov2_vpr__")
    ]
    other_outcomes = [
        v for k, v in outcomes.items() if not k.startswith("dinov2_vpr__")
    ]
    assert dinov2_outcomes == [CompileOutcome.REUSED]
    assert other_outcomes.count(CompileOutcome.BUILT) == 2
    assert len(runtime.calls) == 2


# ----------------------------------------------------------------------
# AC-4: hardware change invalidates cache (all rebuilt; old files untouched)
# ----------------------------------------------------------------------


def test_ac4_hardware_change_invalidates_cache(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
) -> None:
    # Arrange
    old_paths = _populate_cache(backbones, cache_root, host=_HOST_T2)
    runtime = _FakeRuntime(cache_root=cache_root, host=_HOST_T2_NEXT)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root, host=_HOST_T2_NEXT)

    # Act
    results = compiler.compile_engines_for_corpus(request)

    # Assert
    for r in results:
        assert r.outcome is CompileOutcome.BUILT
    for old_path in old_paths.values():
        assert old_path.exists(), (
            f"old engine {old_path} should be untouched on hardware change"
        )


# ----------------------------------------------------------------------
# AC-5: tampered sidecar invalidates that one engine
# ----------------------------------------------------------------------


def test_ac5_tampered_sidecar_invalidates_that_engine(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
) -> None:
    # Arrange
    paths = _populate_cache(backbones, cache_root)
    tampered = paths["lightglue"]
    sidecar = Path(str(tampered) + ".sha256")
    sidecar.write_text(
        "0" * 64
    )
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    with caplog.at_level(logging.WARNING, logger=logger.name):
        results = compiler.compile_engines_for_corpus(request)

    # Assert
    outcome_by_name = {
        Path(r.entry.engine_path).stem.split("__")[0]: r.outcome
        for r in results
    }
    assert outcome_by_name["dinov2_vpr"] is CompileOutcome.REUSED
    assert outcome_by_name["lightglue"] is CompileOutcome.BUILT
    assert outcome_by_name["aliked"] is CompileOutcome.REUSED
    mismatch_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.sidecar.mismatch"
    ]
    assert len(mismatch_kinds) == 1
    assert (
        mismatch_kinds[0].__dict__["kv"]["model_name"] == "lightglue"
    )


# ----------------------------------------------------------------------
# AC-6: ``EngineBuildError`` propagates without partial state corruption
# ----------------------------------------------------------------------


def test_ac6_engine_build_error_propagates_and_third_backbone_untouched(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
) -> None:
    # Arrange
    pre_populated = _populate_cache((backbones[0],), cache_root)
    runtime = _FakeRuntime(
        cache_root=cache_root,
        raise_on={"lightglue": EngineBuildError("CUDA OOM")},
    )
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act + Assert
    with pytest.raises(EngineBuildError, match="CUDA OOM"):
        compiler.compile_engines_for_corpus(request)

    # Backbone 1 reused → untouched on disk
    assert pre_populated["dinov2_vpr"].exists()
    # Backbone 3 (aliked) was never reached → no engine for it on disk
    aliked_filename = EngineFilenameSchema.build(
        model_name="aliked",
        sm=_HOST_T2.sm,
        jetpack=_HOST_T2.jetpack,
        trt=_HOST_T2.trt,
        precision="fp16",
    )
    assert not (cache_root / aliked_filename).exists()
    # Backbone 2 was attempted once; backbone 3 never reached
    assert [c[0].stem for c in runtime.calls] == ["lightglue"]


# ----------------------------------------------------------------------
# AC-7: ``CalibrationCacheError`` propagates with diagnostic
# ----------------------------------------------------------------------


def test_ac7_calibration_cache_error_propagates(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
    tmp_path: Path,
) -> None:
    # Arrange
    calibration_path = tmp_path / "calib_dataset"
    calibration_path.mkdir(parents=True, exist_ok=True)
    runtime = _FakeRuntime(
        cache_root=cache_root,
        raise_on={
            "dinov2_vpr": CalibrationCacheError(
                "calibration table missing for INT8"
            )
        },
    )
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(
        backbones,
        cache_root,
        precision=PrecisionMode.INT8,
        calibration_path=calibration_path,
    )

    # Act + Assert
    with caplog.at_level(logging.ERROR, logger=logger.name):
        with pytest.raises(
            CalibrationCacheError, match="calibration table"
        ):
            compiler.compile_engines_for_corpus(request)

    error_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.compile.error"
    ]
    assert len(error_kinds) == 1
    kv = error_kinds[0].__dict__["kv"]
    assert kv["model_name"] == "dinov2_vpr"
    assert kv["calibration_path"] == str(calibration_path)
    assert kv["error_class"] == "CalibrationCacheError"


# ----------------------------------------------------------------------
# AC-8: filename + sidecar layout matches AZ-281 schema
# ----------------------------------------------------------------------


def test_ac8_filename_and_sidecar_layout(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
) -> None:
    # Arrange
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    results = compiler.compile_engines_for_corpus(request)

    # Assert
    dinov2 = next(
        r for r in results
        if Path(r.entry.engine_path).stem.startswith("dinov2_vpr")
    )
    assert (
        dinov2.entry.engine_path.name
        == "dinov2_vpr__sm87_jp6.2_trt10.3_fp16.engine"
    )
    sidecar = Path(str(dinov2.entry.engine_path) + ".sha256")
    assert sidecar.exists()
    assert len(sidecar.read_text().strip()) == 64
    parsed = EngineFilenameSchema.parse(dinov2.entry.engine_path.name)
    assert parsed.sm == 87
    assert parsed.jetpack == "6.2"
    assert parsed.trt == "10.3"
    assert parsed.precision == "fp16"
    assert Sha256Sidecar.verify(dinov2.entry.engine_path) is True


# ----------------------------------------------------------------------
# AC-9: compile_duration_s recorded for ``built``, ``None`` for ``reused``
# ----------------------------------------------------------------------


def test_ac9_compile_duration_recorded_for_built_only(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
) -> None:
    # Arrange
    _populate_cache((backbones[0],), cache_root)
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    results = compiler.compile_engines_for_corpus(request)

    # Assert
    for r in results:
        if r.outcome is CompileOutcome.BUILT:
            assert r.compile_duration_s is not None
            assert r.compile_duration_s >= 0.0
            assert isinstance(r.compile_duration_s, float)
        else:
            assert r.compile_duration_s is None


# ----------------------------------------------------------------------
# AC-10: empty backbones returns empty result with no side effects
# ----------------------------------------------------------------------


def test_ac10_empty_backbones_returns_empty(
    cache_root: Path,
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
) -> None:
    # Arrange
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request((), cache_root)

    # Act
    with caplog.at_level(logging.DEBUG, logger=logger.name):
        results = compiler.compile_engines_for_corpus(request)

    # Assert
    assert results == ()
    assert runtime.calls == []
    assert list(cache_root.iterdir()) == []
    summary = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.compile.summary"
    ]
    assert len(summary) == 1
    kv = summary[0].__dict__["kv"]
    assert kv["engines_built"] == 0
    assert kv["engines_reused"] == 0
    assert kv["total"] == 0


# ----------------------------------------------------------------------
# Sidecar-missing path (AC-5 sibling): engine on disk but no sidecar at all
# ----------------------------------------------------------------------


def test_missing_sidecar_treated_as_cache_miss(
    cache_root: Path,
    backbones: tuple[BackboneSpec, ...],
    logger: logging.Logger,
    caplog: pytest.LogCaptureFixture,
) -> None:
    # Arrange
    paths = _populate_cache(backbones, cache_root)
    sidecar = Path(str(paths["lightglue"]) + ".sha256")
    sidecar.unlink()
    runtime = _FakeRuntime(cache_root=cache_root)
    compiler = EngineCompiler(inference_runtime=runtime, logger=logger)
    request = _request(backbones, cache_root)

    # Act
    with caplog.at_level(logging.WARNING, logger=logger.name):
        results = compiler.compile_engines_for_corpus(request)

    # Assert
    outcome_by_name = {
        Path(r.entry.engine_path).stem.split("__")[0]: r.outcome
        for r in results
    }
    assert outcome_by_name["lightglue"] is CompileOutcome.BUILT
    mismatch_kinds = [
        rec for rec in caplog.records
        if rec.__dict__.get("kind") == "c10.engine.sidecar.mismatch"
    ]
    assert any(
        rec.__dict__["kv"]["model_name"] == "lightglue"
        for rec in mismatch_kinds
    )


# ----------------------------------------------------------------------
# NFR placeholders (Tier-2 microbench harness owns these on Jetson)
# ----------------------------------------------------------------------


_TIER2_REASON = (
    "AZ-321 Tier-2 microbench harness owns the cache-hit and atomic-"
    "write NFR asserts (200 MB engine sweep, kill-during-compile "
    "scenarios); skipped on Tier-1 CI / macOS dev."
)


@pytest.mark.tier2
def test_nfr_perf_cache_hit_p99_under_1500ms_for_200mb_engine() -> None:
    pytest.skip(_TIER2_REASON)


@pytest.mark.tier2
def test_nfr_reliability_atomic_write_no_half_engine_after_kill() -> None:
    pytest.skip(_TIER2_REASON)