gps-denied-onboard/.github/workflows/ci.yml
Oleksandr Bezdieniezhnykh 65ad2168ed [AZ-300] Implement PytorchFp16Runtime — C7 simple-baseline strategy
AZ-300 mandatory simple-baseline InferenceRuntime (eager FP16 PyTorch).
Implements the AZ-297 Protocol; current_runtime_label returns
"pytorch_fp16". This is the numerical reference that every fancier C7
strategy (AZ-298 TRT, AZ-299 ORT) is measured against, and the only
viable runtime for Tier-1 workstation Docker, where TRT is non-trivial
to install.

Production code (new):
 - components/c7_inference/pytorch_fp16_runtime.py — runtime +
   PytorchEngineHandle + output-shape adapter
 - components/c7_inference/architecture_registry.py — torch-free
   register_architecture / default_registry / ArchitectureFactory
   (Risk-1 mitigation: no L2->L3 back-edge from C7 into per-backbone
   code)
 - components/c7_inference/__init__.py — re-exports the registry
   mechanism. Still does NOT import the concrete strategy module
   (Invariant I-5)
 - components/c7_inference/config.py — adds per_frame_debug_log bool
   field (gates the DEBUG per-frame latency log)
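
A minimal sketch of what a torch-free architecture registry could look like. The names register_architecture, default_registry and ArchitectureFactory come from the commit message; their signatures, the ArchitectureRegistry class, and the duplicate-name check are assumptions made for illustration, not the as-built code. Factories are zero-argument callables, so the registry module never has to import torch or any per-backbone code.

```python
from typing import Any, Callable, Dict, Optional

# Assumed shape: a zero-argument callable that builds a model object.
ArchitectureFactory = Callable[[], Any]


class ArchitectureRegistry:
    """Hypothetical name -> factory map; holds no torch imports itself."""

    def __init__(self) -> None:
        self._factories: Dict[str, ArchitectureFactory] = {}

    def register(self, name: str, factory: ArchitectureFactory) -> None:
        # Refuse silent overwrites so two backbones cannot claim one name.
        if name in self._factories:
            raise ValueError(f"architecture {name!r} is already registered")
        self._factories[name] = factory

    def create(self, name: str) -> Any:
        try:
            factory = self._factories[name]
        except KeyError:
            raise KeyError(f"unknown architecture {name!r}") from None
        return factory()


_DEFAULT_REGISTRY = ArchitectureRegistry()


def default_registry() -> ArchitectureRegistry:
    """Return the process-wide registry instance."""
    return _DEFAULT_REGISTRY


def register_architecture(
    name: str,
    factory: ArchitectureFactory,
    registry: Optional[ArchitectureRegistry] = None,
) -> None:
    """Register a factory, using the default registry unless one is given."""
    (registry or default_registry()).register(name, factory)
```

Under these assumptions, per-backbone code calls register_architecture at import time, and the runtime only ever sees a name and a callable, which keeps the dependency pointing from the backbone towards C7 rather than the other way round.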

Tests (new): tests/unit/c7_inference/test_pytorch_fp16_runtime.py
covers AC-1..AC-8 + NFRs. AC-1/2/6/7 + thermal/release/registry
guards run unconditionally (17 tests); AC-3/4/5/8 +
NFR-perf-deserialize + NFR-reliability-eval-mode require CUDA and
skip on Tier-1 CI / macOS dev.
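
The CUDA gating described above can be sketched as follows. The workflow comment says the real tests use pytest.mark.skipif(not torch.cuda.is_available(), ...); this sketch swaps in the standard library's unittest.skipUnless so that it runs without third-party packages. The cuda_available helper and both test names are hypothetical.

```python
import importlib.util
import unittest


def cuda_available() -> bool:
    """True only if torch is importable and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so this module loads without torch

    return bool(torch.cuda.is_available())


class RuntimeGatingSketch(unittest.TestCase):
    def test_probe_returns_bool(self) -> None:
        # Unconditional guard: runs on CPU-only CI and on macOS dev machines.
        self.assertIsInstance(cuda_available(), bool)

    @unittest.skipUnless(cuda_available(), "requires CUDA")
    def test_fp16_matmul_on_gpu(self) -> None:
        # CUDA-gated: reported as skipped, not failed, when no GPU exists.
        import torch

        x = torch.ones(2, 2, device="cuda", dtype=torch.float16)
        self.assertEqual((x @ x).dtype, torch.float16)
```

pytest collects unittest.TestCase classes as well, so a gated test written this way would still show up in the skipped count on a runner without a GPU.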

Tests (modified):
 - test_protocol_conformance.py — narrowed
   test_ac5_build_inference_runtime_flag_on_but_module_missing
   parametrisation to exclude pytorch_fp16 (now-built); TRT / ORT
   still covered until AZ-298 / AZ-299 ship.

CI: .github/workflows/ci.yml lint + unit jobs now install
'-e .[dev,inference]' because mypy + pytest need torch + torchvision +
onnxruntime on the runner.

Three task-spec -> as-built deltas documented in
_docs/02_tasks/done/AZ-300_c7_pytorch_baseline.md Implementation Notes:
 1. Constructor conforms to AZ-297 factory shape (config positional;
    thermal_publisher + registry + clock keyword-only optionals).
    AZ-302 will update the factory to thread thermal_publisher.
 2. Architecture registry uses extras["model_name"] as lookup key
    (avoids touching the frozen BuildConfig / EngineCacheEntry DTOs).
 3. Warm-up forward deferred to AZ-300 tier-2 follow-up — the zero-arg
    registry has no per-backbone input-shape metadata.
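
Deltas 1 and 2 can be illustrated with a short sketch. It assumes a constructor that takes config positionally and thermal_publisher, registry and clock as keyword-only optionals, and that looks up the architecture through extras["model_name"]. RuntimeConfigSketch, PytorchFp16RuntimeSketch, build_model, and the plain-dict registry are hypothetical stand-ins; whether current_runtime_label is a method or a property in the real code is also an assumption.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional


@dataclass(frozen=True)
class RuntimeConfigSketch:
    """Hypothetical config; only per_frame_debug_log is named in the commit."""

    per_frame_debug_log: bool = False


class PytorchFp16RuntimeSketch:
    def __init__(
        self,
        config: RuntimeConfigSketch,
        *,  # everything after this marker must be passed by keyword
        thermal_publisher: Optional[Any] = None,
        registry: Optional[Mapping[str, Callable[[], Any]]] = None,
        clock: Optional[Callable[[], float]] = None,
    ) -> None:
        self._config = config
        self._thermal_publisher = thermal_publisher
        self._registry = dict(registry or {})
        self._clock = clock or time.monotonic

    def current_runtime_label(self) -> str:
        return "pytorch_fp16"

    def build_model(self, extras: Mapping[str, Any]) -> Any:
        # The lookup key lives in extras, so frozen DTOs need no new field.
        factory = self._registry[extras["model_name"]]
        return factory()


runtime = PytorchFp16RuntimeSketch(
    RuntimeConfigSketch(), registry={"tiny": lambda: "tiny-model"}
)
print(runtime.current_runtime_label())
```

Because the optionals are keyword-only, a factory that passes only config keeps working, and threading thermal_publisher through later does not change the positional call shape.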

Suite: 1120 passed / 10 skipped (CUDA + Tier-2 + cmake / actionlint
environment gates). No regressions in non-c7_inference areas.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 10:13:21 +03:00

105 lines
3.5 KiB
YAML

name: ci-tier1
on:
  push:
    branches: [dev, stage, main]
  pull_request:
    branches: [dev, stage, main]
jobs:
  lint:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      # AZ-300: `[inference]` (torch + torchvision + onnxruntime) is now
      # required for `mypy src` to type-check `c7_inference.pytorch_fp16_runtime`
      # and for `pytest` to collect `test_pytorch_fp16_runtime.py`. Tier-1
      # CI uses the CPU-only torch wheel; CUDA-gated tests skip themselves
      # via `pytest.mark.skipif(not torch.cuda.is_available(), ...)`.
      - run: pip install -e ".[dev,inference]"
      - run: ruff check src tests
      - run: mypy src
  unit:
    runs-on: ubuntu-22.04
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install -e ".[dev,inference]"
      - name: pytest unit (per-component coverage gate)
        run: pytest -q --cov=gps_denied_onboard --cov-fail-under=75 tests/unit
  integration:
    runs-on: ubuntu-22.04
    needs: unit
    steps:
      - uses: actions/checkout@v4
      - name: docker compose up
        run: docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
  build:
    name: build-${{ matrix.kind }}
    runs-on: ubuntu-22.04
    needs: lint
    strategy:
      fail-fast: false
      matrix:
        kind: [deployment, research]
        include:
          # AZ-332: BUILD_OKVIS2 forced OFF in Tier-1 CI until the tier2
          # follow-up wires `okvis::ThreadedKFVio` end-to-end. The C++
          # binding skeleton + CMake glue still ship in this build; full
          # OKVIS2 native compile is gated on installing Ceres-solver +
          # OKVIS2 vendored submodules (BRISK, DBoW2) via apt, plus
          # `submodules: recursive` checkout. That CI lift is the
          # tier2 task's surface, not AZ-332's.
          - kind: deployment
            cmake_flags: >-
              -DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=OFF
              -DBUILD_VPR_SALAD=OFF -DBUILD_C11_TILE_MANAGER=OFF
          - kind: research
            cmake_flags: >-
              -DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=ON -DBUILD_VPR_SALAD=ON
    steps:
      - uses: actions/checkout@v4
      - run: cmake -S . -B build ${{ matrix.cmake_flags }}
      - run: cmake --build build --parallel
  sbom-diff:
    runs-on: ubuntu-22.04
    needs: build
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: SBOM diff (ADR-002 enforcement)
        run: python ci/sbom_diff.py --deployment build-deployment-sbom.json --research build-research-sbom.json
  security:
    runs-on: ubuntu-22.04
    needs: build
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install pip-audit
      - run: pip-audit -r pyproject.toml || true
      - name: OpenCV pin gate (D-CROSS-CVE-1)
        run: python ci/opencv_pin_gate.py --pyproject pyproject.toml
  push-images:
    runs-on: ubuntu-22.04
    if: github.event_name == 'push' && contains(fromJson('["refs/heads/dev","refs/heads/stage","refs/heads/main"]'), github.ref)
    needs: [unit, integration, build, sbom-diff, security]
    steps:
      - uses: actions/checkout@v4
      - run: echo "push images to GHCR (deployment + research) — wiring lands per release task"