gps-denied-onboard/_docs/03_implementation/batch_25_cycle1_report.md
Oleksandr Bezdieniezhnykh 59f56c032f [AZ-301] Implement EngineGate — D-C10-3 + D-C10-7 takeoff validator
AZ-301 takeoff-side validator every InferenceRuntime strategy calls
before deserialize_engine. Five-step deterministic refusal pipeline,
in order:

  1. filename schema parse  -> EngineSchemaMismatchError(reason=...)
  2. schema tuple match     -> EngineSchemaMismatchError(expected,got)
  3. sidecar present        -> EngineSidecarMissingError
  4. sidecar trust          -> EngineHashMismatchError(stage=sidecar)
  5. manifest match         -> EngineHashMismatchError(stage=manifest)

Refusal order is part of the public contract (AC-7 verifies that a
fixture that is BOTH schema-mismatched AND missing-sidecar is refused
at step 1).

Production code (new):
 - components/c7_inference/engine_gate.py  -- EngineGate, HostTuple,
   read_host_tuple (Jetson: pynvml + /etc/nv_tegra_release +
   tensorrt.__version__; raises RuntimeError on Tier-1)
 - components/c7_inference/manifest.py     -- DeploymentManifest,
   ManifestReader, ManifestReaderProtocol. Risk-2 enforced at the
   type level: __getitem__ raises EngineHashMismatchError on
   missing key, NEVER KeyError, so the gate cannot silently pass
 - components/c7_inference/__init__.py     -- re-exports the new
   public surface

Tests (new): tests/unit/c7_inference/test_engine_gate.py covers
AC-1..AC-7 + NFR-reliability-no-write + manifest reader + refusal
log emission. 14 tests unconditional + AC-8 Tier-2 skip (needs
real NVML + L4T release file + tensorrt binding).

Three task-spec -> as-built deltas documented in
_docs/02_tasks/done/AZ-301_c7_engine_gate.md Implementation Notes:
 1. HostTuple lives in engine_gate.py (the only consumer);
    re-exported from package __init__.py.
 2. read_host_tuple takes precision as a keyword argument — three
    of four fields come from the host, precision is engine-build
    metadata supplied by the caller.
 3. AC-8 is Tier-2-only; AC-1..AC-7 + NFR-reliability + extras
    run on every CI host.

Risk-2 (manifest reader silently treats missing entry as pass):
DeploymentManifest.__getitem__ raises EngineHashMismatchError with
"missing manifest entry for {path}" — covered by
test_manifest_missing_entry_raises_hash_mismatch.

NFR-perf-validate (p99 <= 50 ms): tier-2 only — a real 500 MB
engine streaming sha256 cannot be benchmarked on Tier-1 fixtures.

AZ-302 (ThermalStatePublisher) + AZ-304 (C6 Postgres schema)
deferred to batches 26 / 27 to keep the 1-task batch cadence and
isolate their respective env / testcontainer surface areas.

Suite: 1134 passed / 11 skipped. No regressions outside the new
files.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 10:20:21 +03:00


Batch 25 / Cycle 1 — Implementation Report

Date: 2026-05-12
Tasks: AZ-301 (C7 EngineGate: D-C10-3 + D-C10-7 takeoff validator)
Story points landed: 3
Status: complete (AZ-301 → In Testing)

Scope summary

Single-task batch, continuing the post-AZ-300 1-task cadence. The user selected AZ-301 + AZ-302 + (optionally) AZ-304 for batch 25, but only AZ-301 lands here. AZ-302 (ThermalStatePublisher, 3pt) requires jtop / pynvml integration plus a background thread, and AZ-304 (C6 Postgres schema, 2pt) requires testcontainers + Alembic. Bundling either with AZ-301 would break the project's 1-task cadence and bloat the commit beyond practical review, so each ships as its own batch (26 / 27).

Files added / modified

New

  • src/gps_denied_onboard/components/c7_inference/engine_gate.py: EngineGate.validate five-step deterministic pipeline + HostTuple frozen dataclass + read_host_tuple() Jetson helper (lazy NVML / L4T / TRT-version reads).
  • src/gps_denied_onboard/components/c7_inference/manifest.py: DeploymentManifest (Risk-2-compliant __getitem__ that raises EngineHashMismatchError on missing key) + ManifestReader on-disk JSON loader + ManifestReaderProtocol for test fakes.
  • tests/unit/c7_inference/test_engine_gate.py: 15 tests covering AC-1..AC-7 + NFR-reliability-no-write + manifest-reader coverage + refusal-log emission. AC-8 (real Jetson NVML) is Tier-2-only and skips via GPS_DENIED_TIER env gate.

Modified

  • src/gps_denied_onboard/components/c7_inference/__init__.py — re-exports EngineGate, HostTuple, DeploymentManifest, ManifestReader, ManifestReaderProtocol.
  • _docs/02_tasks/todo/AZ-301_c7_engine_gate.md → moved to _docs/02_tasks/done/; added ## Implementation Notes (2026-05-12, batch 25) documenting the three task-spec → as-built deltas (HostTuple module location, explicit precision arg on read_host_tuple, AC-8 Tier-2 skip).
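
The HostTuple / read_host_tuple(*, precision) shape listed above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the field names sm, jp, trt, and precision follow the report, but their types, the injected read_sm / read_jp / read_trt callables (stand-ins for the NVML, L4T, and TensorRT probes), and the sample version strings are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HostTuple:
    """Four-field identity an engine must match (field types are assumed)."""
    sm: int          # GPU compute capability, e.g. 87
    jp: str          # JetPack / L4T release
    trt: str         # TensorRT version
    precision: str   # engine-build metadata, supplied by the caller


def read_host_tuple_sketch(
    *,
    precision: str,
    read_sm: Callable[[], int],
    read_jp: Callable[[], str],
    read_trt: Callable[[], str],
) -> HostTuple:
    """Hypothetical stand-in: the three readers replace the real host probes."""
    try:
        return HostTuple(sm=read_sm(), jp=read_jp(), trt=read_trt(), precision=precision)
    except Exception as exc:
        # Any probe failure surfaces as RuntimeError, as the report describes.
        raise RuntimeError(f"cannot read host tuple: {exc}") from exc


# Illustrative values only; "6.0" and "10.3" are made up for the example.
host = read_host_tuple_sketch(
    precision="fp16",
    read_sm=lambda: 87,
    read_jp=lambda: "6.0",
    read_trt=lambda: "10.3",
)
engine = HostTuple(sm=86, jp="6.0", trt="10.3", precision="fp16")
print(host == engine)  # False: sm 86 vs host 87, the AC-2 case
```

Because the dataclass is frozen, equality is a plain field-by-field comparison and instances cannot be mutated after construction, which suits a value that only ever gets compared.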

Design decisions

  1. HostTuple module location: co-located with engine_gate.py (the only consumer); re-exported from package __init__.py. Spec left it unpinned. Risk: minor lift if a future component needs it directly; acceptable since the public surface lives in the package.
  2. read_host_tuple(*, precision) keyword argument: the helper reads three of the four tuple fields from the host (NVML → sm; /etc/nv_tegra_release → jp; tensorrt.__version__ → trt). precision is engine-build metadata, not a host property, so the caller passes it. Matches the spec's "derived from nvidia-smi/pynvml + the runtime's pinned TRT version + the engine's intended precision" clause.
  3. Risk-2 enforcement in DeploymentManifest.__getitem__: missing-key access raises EngineHashMismatchError directly (not KeyError). Eliminates the silent-pass class of bug at the type level — any consumer using the manifest must handle the C7-family error, never KeyError.
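
The Risk-2 decision in item 3 can be illustrated with a short sketch. The class names mirror the report, but the constructor, the dict-backed storage, and the stage attribute are assumptions made for the example; the real DeploymentManifest may be shaped differently.

```python
class EngineHashMismatchError(Exception):
    """Stand-in for the C7-family error named in the report."""

    def __init__(self, message: str, stage: str = "manifest") -> None:
        super().__init__(message)
        self.stage = stage


class DeploymentManifestSketch:
    """Hypothetical dict-backed manifest: path -> expected sha256 hex digest."""

    def __init__(self, entries: dict[str, str]) -> None:
        self._entries = dict(entries)

    def __getitem__(self, path: str) -> str:
        try:
            return self._entries[path]
        except KeyError:
            # Translate the lookup failure so callers never see KeyError.
            raise EngineHashMismatchError(
                f"missing manifest entry for {path}"
            ) from None


manifest = DeploymentManifestSketch({"a.engine": "deadbeef"})
try:
    manifest["b.engine"]
except EngineHashMismatchError as exc:
    print(exc)  # missing manifest entry for b.engine
```

A caller that wraps the lookup in `except KeyError` and treats it as "nothing to compare" would let an unlisted engine through; raising the gate's own error type removes that path.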

AC coverage

| AC | Status | Notes |
|---|---|---|
| AC-1 schema parse failure | covered | test_ac1_parse_failure_refused_at_parse_time |
| AC-2 schema tuple mismatch | covered | test_ac2_schema_tuple_mismatch (sm 86 vs host 87) |
| AC-3 missing sidecar | covered | test_ac3_missing_sidecar_refused_before_manifest |
| AC-4 sidecar trust | covered | test_ac4_sidecar_hash_mismatches_file |
| AC-5 manifest mismatch | covered | test_ac5_manifest_hash_mismatches_sidecar |
| AC-6 happy path + INFO log | covered | test_ac6_full_success_returns_silently_and_logs_pass |
| AC-7 schema wins over sidecar | covered | test_ac7_schema_error_wins_over_sidecar_missing |
| AC-8 read_host_tuple on Jetson | tier2 | test_ac8_read_host_tuple_on_jetson; @pytest.mark.tier2 + GPS_DENIED_TIER!=2 skip |
| NFR-perf-validate (≤ 50 ms) | tier2 | Real engine-size benchmarks belong on Jetson |
| NFR-reliability-no-write | covered | test_nfr_reliability_no_writes snapshots mtime + bytes + sidecar text pre/post-validate |
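
The NFR-reliability-no-write check (snapshot mtime + bytes before and after validation) might look something like the sketch below. The read_only_validate function, the ".sha256" sidecar suffix, and the file names are hypothetical; the sketch only shows the snapshot-and-compare idea.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(path: Path) -> tuple[int, str]:
    """Capture (mtime in ns, sha256 of contents) for one file."""
    return path.stat().st_mtime_ns, hashlib.sha256(path.read_bytes()).hexdigest()


def read_only_validate(engine: Path, sidecar: Path) -> bool:
    """Hypothetical validator: compares the engine hash to the sidecar text."""
    return hashlib.sha256(engine.read_bytes()).hexdigest() == sidecar.read_text().strip()


with tempfile.TemporaryDirectory() as tmp:
    engine = Path(tmp) / "model.engine"
    sidecar = Path(tmp) / "model.engine.sha256"  # assumed sidecar naming
    engine.write_bytes(b"fake engine bytes")
    sidecar.write_text(hashlib.sha256(b"fake engine bytes").hexdigest())

    before = (snapshot(engine), snapshot(sidecar))
    ok = read_only_validate(engine, sidecar)
    after = (snapshot(engine), snapshot(sidecar))

    # A pure validator leaves both files byte-identical with unchanged mtimes.
    unchanged = before == after
```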

Additional coverage beyond ACs:

  • test_manifest_reader_round_trip: JSON ⇄ DTO round-trip.
  • test_manifest_missing_entry_raises_hash_mismatch: Risk-2 (the critical "no silent pass" property).
  • test_manifest_reader_rejects_malformed_json / test_manifest_reader_rejects_missing_entries_key: bad-input refusal at parse time.
  • test_engine_outside_manifest_root_refused: engine_path not under manifest.root raises EngineHashMismatchError.
  • test_refusal_emits_error_log: c7.gate.refuse ERROR log emitted with step + reason fields.

Test run

.venv/bin/pytest tests/unit/c7_inference/  → 77 passed,  7 skipped
.venv/bin/pytest                            → 1134 passed, 11 skipped

Skips are environment-gated (CUDA for AZ-300, Tier-2 GPS_DENIED_TIER for AZ-301 AC-8, cmake + actionlint absent on dev). No pre-existing tests regressed.
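
As a rough illustration of the GPS_DENIED_TIER gate mentioned above, the skip condition could be computed by a small helper like this one. The helper name tier2_enabled and the exact comparison are assumptions; only the environment variable name comes from the report.

```python
import os
from typing import Mapping


def tier2_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when GPS_DENIED_TIER is set to "2" (assumed convention)."""
    return env.get("GPS_DENIED_TIER", "").strip() == "2"


# In a pytest module the helper would typically feed a skip marker, e.g.:
#
#   @pytest.mark.tier2
#   @pytest.mark.skipif(not tier2_enabled(), reason="needs Tier-2 Jetson")
#   def test_ac8_read_host_tuple_on_jetson(): ...

print(tier2_enabled({"GPS_DENIED_TIER": "2"}))  # True
print(tier2_enabled({}))                        # False
```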

Self-review verdict

Pass. Pure validator, no GPU ops, no writes. Five refusal paths in the documented order; AC-7 verifies the discipline. Risk-2 raised at the type level via DeploymentManifest.__getitem__.
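
The ordering discipline can be sketched as a straight-line function in which each check raises before the next one runs. Everything concrete here is assumed for illustration: the filename pattern, the two-field host tuple, the ".sha256" sidecar suffix, and the plain-dict manifest. Only the five steps and the error names come from the report.

```python
import hashlib
import re
import tempfile
from pathlib import Path


class EngineSchemaMismatchError(Exception): ...
class EngineSidecarMissingError(Exception): ...
class EngineHashMismatchError(Exception):
    def __init__(self, message: str, stage: str) -> None:
        super().__init__(message)
        self.stage = stage


# Hypothetical filename schema, e.g. "detector.sm87_fp16.engine".
_NAME = re.compile(r"^(?P<name>\w+)\.sm(?P<sm>\d+)_(?P<precision>\w+)\.engine$")


def validate(engine: Path, host: tuple[int, str], manifest: dict[str, str]) -> None:
    # 1. filename schema parse
    m = _NAME.match(engine.name)
    if m is None:
        raise EngineSchemaMismatchError(f"unparseable filename: {engine.name}")
    # 2. schema tuple match
    got = (int(m["sm"]), m["precision"])
    if got != host:
        raise EngineSchemaMismatchError(f"expected {host}, got {got}")
    # 3. sidecar present
    sidecar = engine.with_name(engine.name + ".sha256")
    if not sidecar.is_file():
        raise EngineSidecarMissingError(str(sidecar))
    # 4. sidecar trust: the file's hash must equal the sidecar's claim
    sidecar_hash = sidecar.read_text().strip()
    if hashlib.sha256(engine.read_bytes()).hexdigest() != sidecar_hash:
        raise EngineHashMismatchError("file hash != sidecar", stage="sidecar")
    # 5. manifest match: a missing entry (None) also refuses
    if manifest.get(engine.name) != sidecar_hash:
        raise EngineHashMismatchError("sidecar hash != manifest", stage="manifest")


# AC-7-style fixture: schema-mismatched AND missing its sidecar.
refused_by = None
with tempfile.TemporaryDirectory() as tmp:
    engine = Path(tmp) / "detector.sm86_fp16.engine"
    engine.write_bytes(b"fake")
    try:
        validate(engine, host=(87, "fp16"), manifest={})
    except EngineSchemaMismatchError as exc:
        refused_by = type(exc).__name__

print(refused_by)  # EngineSchemaMismatchError
```

Since the checks run top to bottom with no shared state, the first failing step decides the error type, and the schema error is raised before the sidecar is ever looked for.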

Known gaps for the Product Implementation Completeness Gate

  • AC-8 / NFR-perf-validate not validated on dev: needs Tier-2 Jetson. The CI matrix's runs-on: ubuntu-22.04 cannot exercise these — same gap pattern as AZ-300's CUDA-gated AC-3/4/5/8 and AZ-332's tier-2 marker.
  • read_host_tuple is minimally tested: NVML init failure, an unrecognised L4T release, and a missing tensorrt binding all raise RuntimeError, but the unconditional test suite cannot exercise the success path. A future Tier-2 integration test should pin that behaviour.
  • Manifest schema is owned by E-C10: this task ships only the reader. The writer (CacheProvisioner) is a separate task; until it lands, integration testing of the gate uses dict-backed manifest fixtures.

Next batch

Batch 26 candidates:

  • AZ-302 (ThermalStatePublisher, 3pt): background thread + jtop / pynvml + FDR transition records. Requires test-side fake sources. jtop + pynvml may need to be added to pyproject extras for Tier-1 CI, because even with fakes the publisher's import-attempt logic fails when the modules are absent; handle via lazy import.
  • AZ-304 (C6 Postgres schema, 2pt): Alembic migration + testcontainers Postgres 16 + schema-shape fixture diff test.
  • 17 tasks total ready in the queue (AZ-300 + AZ-301 removed).

Recommended batch 26 size: 1 task (continue post-AZ-300 cadence). AZ-302's surface (8 ACs + threading + lazy-import gates) suggests it ship alone.
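
The lazy-import approach suggested for AZ-302 could take roughly the following shape. This is a sketch under assumptions: lazy_import and ThermalSourceSketch are hypothetical names, and the real publisher's structure is not described in this report.

```python
import importlib
from types import ModuleType
from typing import Optional


def lazy_import(name: str) -> Optional[ModuleType]:
    """Return the module if it can be imported, else None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


class ThermalSourceSketch:
    """Hypothetical source that defers its backend import until first use."""

    def __init__(self, module_name: str = "pynvml") -> None:
        self._module_name = module_name
        self._module: Optional[ModuleType] = None

    def backend(self) -> ModuleType:
        if self._module is None:
            self._module = lazy_import(self._module_name)
        if self._module is None:
            raise RuntimeError(f"{self._module_name} is not installed")
        return self._module


# Constructing the object never imports the backend, so a Tier-1 host
# without pynvml can still import the module and inject a fake instead.
source = ThermalSourceSketch("json")  # stdlib module used as a stand-in
print(source.backend().__name__)  # json
```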