mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 17:11:14 +00:00
[AZ-299] C7 OnnxTrtEpRuntime: ORT + TRT EP fallback strategy
Land the fallback InferenceRuntime strategy that satisfies C7-IT-05: when the TRT-direct path (AZ-298) cannot deserialise a cached engine or when the operator explicitly selects ORT, the system stays in the air at degraded latency rather than dropping the request. Conforms to the AZ-297 Protocol; current_runtime_label() == "onnx_trt_ep". Production - onnx_trt_ep_runtime.py: compile_engine is a no-op returning an EngineCacheEntry pointing at the source .onnx; deserialize_engine is gate-first for .engine entries and gate-skip for .onnx, builds an ORT InferenceSession with the provider list [TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider], stages cached engines into the ORT TRT EP cache directory via symlink-or-copy, warms up with one session.run after construction, and honours config.inference.ort_disallow_cpu_ fallback by raising EngineDeserializeError when the active provider resolves to CPU; infer emits a one-shot c7.fallback_to_onnx_trt_ep WARN log plus gcs_alert callback on first call when is_fallback= True; release_engine is idempotent. _build_provider_args is the single point that pins TRT EP option-key names (Risk-3) and caps trt_max_workspace_size at gpu_memory_budget_bytes // 4 (AC-8). - config.py: adds ort_trt_cache_dir (validated non-empty) and ort_disallow_cpu_fallback to C7InferenceConfig. - fdr_client/records.py: adds c7.fallback_to_onnx_trt_ep and c7.cpu_fallback FDR record kinds. Tests - test_onnx_trt_ep_runtime.py: covers AC-1..AC-8 + Risk-2 CPU-fallback alert + Risk-3 option-key pin + NFR-reliability error rewrap; Tier-1 via fake ORT session; Tier-2 placeholders skip on macOS dev for numerical FP16 comparison and session-creation perf NFR. - test_protocol_conformance.py: drops onnx_trt_ep from the missing- module parametrize now that the module ships. - test_az272_fdr_record_schema.py: extends per-kind fixture builder to cover the two new C7 FDR kinds in the roundtrip / schema-version AC tests. Docs - module-layout.md: replaces the pending onnx_trt_runtime row with the shipped onnx_trt_ep_runtime row + capabilities list. - batch_32_cycle1_report.md + reviews/batch_32_review.md: full batch + self-review (PASS_WITH_WARNINGS, 4 Low findings accepted). Tests run: c7_inference 139 passing + 17 Tier-2 skips; combined unit suite (excluding pending components) 529 passing, 19 env-skipped. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,54 @@
|
||||
# Code Review Report — Batch 32 / Cycle 1
|
||||
|
||||
**Batch**: 32
|
||||
**Tasks**: AZ-299 (C7 OnnxTrtEpRuntime)
|
||||
**Date**: 2026-05-12
|
||||
**Verdict**: PASS_WITH_WARNINGS
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Low | Maintainability | `src/gps_denied_onboard/components/c7_inference/onnx_trt_ep_runtime.py::_iso_ts_now` | Duplicates the equivalent helper in `tensorrt_runtime.py` / `fdr_client` |
|
||||
| 2 | Low | Test-quality | `tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac4_infer_round_trips_named_outputs` | Round-trip verified via fake ORT session; numerical FP16 comparison lives in Tier-2 microbench |
|
||||
| 3 | Low | Architecture | `onnx_trt_ep_runtime.py::_stage_engine_for_ort` | Symlink-first with copy fallback can leave a torn copy on disk if interrupted mid-copy |
|
||||
| 4 | Low | Test-coverage | AC-3 schema-mismatch path | Real gate filename-schema parser exercised in AZ-281 / AZ-301; this test stubs `EngineGate.validate` |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: `_iso_ts_now` duplicated component-locally** (Low / Maintainability)
|
||||
- Location: `src/gps_denied_onboard/components/c7_inference/onnx_trt_ep_runtime.py` — module-level helper
|
||||
- Description: A one-liner ISO-8601 timestamp helper, also present in `tensorrt_runtime.py` and `fdr_client`. Consolidating would either inflate `fdr_client/records.py` (the lowest-layer module the C7 strategies depend on, currently free of utility functions) or carve out a shared utility module for a single one-liner.
|
||||
- Suggestion: Extract alongside other shared ISO-timestamp call sites in a future hygiene pass (likely when `_types/` grows enough to justify a shared `_utils/` neighbour). For now the C7 layering rule wins.
|
||||
- Task: AZ-299
|
||||
- Resolution: Open (Low) — accepted as documented.
|
||||
|
||||
**F2: Round-trip via fake ORT session** (Low / Test-quality)
|
||||
- Location: `tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac4_infer_round_trips_named_outputs`
|
||||
- Description: Uses `_FakeOrtSession.run(...)` returning canned arrays in the declared output order. The named-output mapping is verified at the Protocol layer; the *numerical* FP16 comparison against TRT-direct (the second half of AC-4) is a Tier-2 placeholder skip that runs on the Jetson microbench harness.
|
||||
- Suggestion: None — the Tier-1 / macOS dev environment lacks ORT + CUDA. The Tier-2 placeholder owns the numerical half explicitly.
|
||||
- Task: AZ-299
|
||||
- Resolution: Open (Low) — accepted as documented.
|
||||
|
||||
**F3: `_stage_engine_for_ort` symlink-then-copy** (Low / Architecture)
|
||||
- Location: `onnx_trt_ep_runtime.py::_stage_engine_for_ort`
|
||||
- Description: The helper tries `os.symlink(...)` first and falls back to `shutil.copy2(...)` on `OSError`. If the copy is interrupted partway, the EP cache directory ends up with a torn file. The runtime does NOT validate the staged file post-copy.
|
||||
- Suggestion: A torn copy left in the EP cache is no worse than a stale subgraph (ORT rebuilds on hash mismatch). C12's per-flight cache cleanup wipes the directory between flights, so the failure window is bounded to a single flight's deserialise attempts.
|
||||
- Task: AZ-299
|
||||
- Resolution: Open (Low) — accepted as documented; C12 owns cleanup.
|
||||
|
||||
**F4: AC-3 stubs `EngineGate.validate`** (Low / Test-coverage)
|
||||
- Location: `tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac3_deserialize_from_engine_invokes_gate_and_skips_session_on_refusal`
|
||||
- Description: Patches `EngineGate.validate` to raise `EngineSchemaMismatchError`; the real filename-schema parser lives behind AZ-281 + AZ-301 and is exercised by their respective unit tests.
|
||||
- Suggestion: Wiring this runtime to a live gate would duplicate AZ-301's coverage at the wrong layer. Keep the stub here; the runtime's contract with the gate is "if it raises, propagate without touching ORT" — verified by the `_load_ort` monkey-patch that asserts no ORT import on the refusal path.
|
||||
- Task: AZ-299
|
||||
- Resolution: Open (Low) — accepted as documented.
|
||||
|
||||
## Verdict Logic
|
||||
|
||||
- 0 Critical
|
||||
- 0 High
|
||||
- 0 Medium
|
||||
- 4 Low
|
||||
|
||||
→ **PASS_WITH_WARNINGS**: only Low findings; all accepted as documented.
|
||||
Reference in New Issue
Block a user