Files
gps-denied-onboard/_docs/03_implementation/reviews/batch_32_review.md
T
Oleksandr Bezdieniezhnykh 0ad3278b12 [AZ-299] C7 OnnxTrtEpRuntime: ORT + TRT EP fallback strategy
Land the fallback InferenceRuntime strategy that satisfies C7-IT-05:
when the TRT-direct path (AZ-298) cannot deserialise a cached engine
or when the operator explicitly selects ORT, the system stays in the
air at degraded latency rather than dropping the request. Conforms to
the AZ-297 Protocol; current_runtime_label() == "onnx_trt_ep".

Production
- onnx_trt_ep_runtime.py: compile_engine is a no-op returning an
  EngineCacheEntry pointing at the source .onnx; deserialize_engine
  is gate-first for .engine entries and gate-skip for .onnx, builds
  an ORT InferenceSession with the provider list
  [TensorrtExecutionProvider, CUDAExecutionProvider,
  CPUExecutionProvider], stages cached engines into the ORT TRT EP
  cache directory via symlink-or-copy, warms up with one session.run
  after construction, and honours config.inference.ort_disallow_cpu_
  fallback by raising EngineDeserializeError when the active provider
  resolves to CPU; infer emits a one-shot c7.fallback_to_onnx_trt_ep
  WARN log plus gcs_alert callback on first call when is_fallback=
  True; release_engine is idempotent. _build_provider_args is the
  single point that pins TRT EP option-key names (Risk-3) and caps
  trt_max_workspace_size at gpu_memory_budget_bytes // 4 (AC-8).
- config.py: adds ort_trt_cache_dir (validated non-empty) and
  ort_disallow_cpu_fallback to C7InferenceConfig.
- fdr_client/records.py: adds c7.fallback_to_onnx_trt_ep and
  c7.cpu_fallback FDR record kinds.

Tests
- test_onnx_trt_ep_runtime.py: covers AC-1..AC-8 + Risk-2 CPU-fallback
  alert + Risk-3 option-key pin + NFR-reliability error rewrap; Tier-1
  via fake ORT session; Tier-2 placeholders skip on macOS dev for
  numerical FP16 comparison and session-creation perf NFR.
- test_protocol_conformance.py: drops onnx_trt_ep from the missing-
  module parametrize now that the module ships.
- test_az272_fdr_record_schema.py: extends per-kind fixture builder
  to cover the two new C7 FDR kinds in the roundtrip / schema-version
  AC tests.

Docs
- module-layout.md: replaces the pending onnx_trt_runtime row with
  the shipped onnx_trt_ep_runtime row + capabilities list.
- batch_32_cycle1_report.md + reviews/batch_32_review.md: full batch
  + self-review (PASS_WITH_WARNINGS, 4 Low findings accepted).

Tests run: c7_inference 139 passing + 17 Tier-2 skips; combined unit
suite (excluding pending components) 529 passing, 19 env-skipped.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 23:55:50 +03:00

4.0 KiB

Code Review Report — Batch 32 / Cycle 1

Batch: 32 Tasks: AZ-299 (C7 OnnxTrtEpRuntime) Date: 2026-05-12 Verdict: PASS_WITH_WARNINGS

Findings

# Severity Category File:Line Title
1 Low Maintainability src/gps_denied_onboard/components/c7_inference/onnx_trt_ep_runtime.py::_iso_ts_now Duplicates the equivalent helper in tensorrt_runtime.py / fdr_client
2 Low Test-quality tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac4_infer_round_trips_named_outputs Round-trip verified via fake ORT session; numerical FP16 comparison lives in Tier-2 microbench
3 Low Architecture onnx_trt_ep_runtime.py::_stage_engine_for_ort Symlink-first with copy fallback can leave a torn copy on disk if interrupted mid-copy
4 Low Test-coverage AC-3 schema-mismatch path Real gate filename-schema parser exercised in AZ-281 / AZ-301; this test stubs EngineGate.validate

Finding Details

F1: _iso_ts_now duplicated component-locally (Low / Maintainability)

  • Location: src/gps_denied_onboard/components/c7_inference/onnx_trt_ep_runtime.py — module-level helper
  • Description: A one-liner ISO-8601 timestamp helper, also present in tensorrt_runtime.py and fdr_client. Consolidating would either inflate fdr_client/records.py (the lowest-layer module the C7 strategies depend on, currently free of utility functions) or carve out a shared utility module for a single one-liner.
  • Suggestion: Extract alongside other shared ISO-timestamp call sites in a future hygiene pass (likely when _types/ grows enough to justify a shared _utils/ neighbour). For now the C7 layering rule wins.
  • Task: AZ-299
  • Resolution: Open (Low) — accepted as documented.

F2: Round-trip via fake ORT session (Low / Test-quality)

  • Location: tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac4_infer_round_trips_named_outputs
  • Description: Uses _FakeOrtSession.run(...) returning canned arrays in the declared output order. The named-output mapping is verified at the Protocol layer; the numerical FP16 comparison against TRT-direct (the second half of AC-4) is a Tier-2 placeholder skip that runs on the Jetson microbench harness.
  • Suggestion: None — the Tier-1 / macOS dev environment lacks ORT + CUDA. The Tier-2 placeholder owns the numerical half explicitly.
  • Task: AZ-299
  • Resolution: Open (Low) — accepted as documented.

F3: _stage_engine_for_ort symlink-then-copy (Low / Architecture)

  • Location: onnx_trt_ep_runtime.py::_stage_engine_for_ort
  • Description: The helper tries os.symlink(...) first and falls back to shutil.copy2(...) on OSError. If the copy is interrupted partway, the EP cache directory ends up with a torn file. The runtime does NOT validate the staged file post-copy.
  • Suggestion: A torn copy left in the EP cache is no worse than a stale subgraph (ORT rebuilds on hash mismatch). C12's per-flight cache cleanup wipes the directory between flights, so the failure window is bounded to a single flight's deserialise attempts.
  • Task: AZ-299
  • Resolution: Open (Low) — accepted as documented; C12 owns cleanup.

F4: AC-3 stubs EngineGate.validate (Low / Test-coverage)

  • Location: tests/unit/c7_inference/test_onnx_trt_ep_runtime.py::test_ac3_deserialize_from_engine_invokes_gate_and_skips_session_on_refusal
  • Description: Patches EngineGate.validate to raise EngineSchemaMismatchError; the real filename-schema parser lives behind AZ-281 + AZ-301 and is exercised by their respective unit tests.
  • Suggestion: Wiring this runtime to a live gate would duplicate AZ-301's coverage at the wrong layer. Keep the stub here; the runtime's contract with the gate is "if it raises, propagate without touching ORT" — verified by the _load_ort monkey-patch that asserts no ORT import on the refusal path.
  • Task: AZ-299
  • Resolution: Open (Low) — accepted as documented.

Verdict Logic

  • 0 Critical
  • 0 High
  • 0 Medium
  • 4 Low

PASS_WITH_WARNINGS: only Low findings; all accepted as documented.