gps-denied-onboard/_docs/02_document/deployment/ci_cd_pipeline.md
Oleksandr Bezdieniezhnykh 64542d32fc Update autodev state, architecture documentation, and glossary terms
Transitioned the autodev state to phase 21, reflecting the completion of Step 5 and the drafting of Step 6 epics. Revised the architecture documentation to clarify the roles of the Tile Manager and its components, ensuring accurate representation of the system's operational flow. Updated glossary entries for Flight State and Operator to incorporate recent changes and enhance clarity on component interactions and responsibilities.
2026-05-10 00:21:34 +03:00


GPS-Denied Onboard — CI/CD Pipeline

Date: 2026-05-09 (Plan Phase 2c — initial draft). Inputs: _docs/02_document/architecture.md § 3 (Deployment Model); ADR-002 (build-time exclusion); ADR-005 (Tier-1 / Tier-2 are first-class); ADR-007 (mock-suite-sat-service is an e2e-test fixture; reversed 2026-05-09 from the earlier "real component boundary" framing).

Pipeline Overview

The pipeline has two execution tiers (architecture.md ADR-005), reflected in two CI runner pools that share the same workflow definitions but differ in runner labels and active job set:

| Stage | Trigger | Runner | Quality Gate |
| --- | --- | --- | --- |
| Lint | Every push, every PR | Tier-1 (GitHub-hosted x86_64) | Zero lint errors (Python: ruff + mypy --strict; C++: clang-format --dry-run + clang-tidy; CMake: cmakelang) |
| Unit | Every push, every PR | Tier-1 | All unit tests pass; coverage ≥ 75 % per component, ≥ 90 % on safety-critical (C5 state estimator, C8 FC adapters) |
| Integration (Tier-1) | Every push, every PR | Tier-1 | Tier-1 integration suite passes (uses docker-compose.test.yml: companion + mock-sat + db + e2e-runner) |
| Build (Tier-1, both binaries) | Every push, every PR | Tier-1 | companion-tier1:deployment-<sha> AND companion-tier1:research-<sha> build green (ADR-002 dual-emit) |
| SBOM diff | After build | Tier-1 | Deployment SBOM excludes vins_mono, salad, etc.; research SBOM includes all strategies; PR fails on mismatch |
| Security | After build | Tier-1 | Zero unpatched critical / high CVEs (pip-audit + dotnet list package --vulnerable for mock-sat + Trivy on images) |
| Push images (Tier-1) | PR merge to dev, stage, main | Tier-1 | Push succeeds; PRs do NOT push (avoids polluting registry) |
| Build (Tier-2 deployment binary) | PR merge to dev, stage, main | Tier-2 (self-hosted Jetson) | Native build on Jetson green; deployment binary SBOM matches Tier-1 deployment SBOM |
| AC-bound NFTs (Tier-2) | PR merge to dev, stage, main; manual on PR | Tier-2 | NFT-PERF-* (AC-4.1, AC-NEW-1, AC-NEW-2), NFT-LIM-* (AC-4.2, AC-NEW-3), NFT-RES-* (AC-NEW-4, AC-NEW-7), IT-12 (comparative study) all pass thresholds in tests/traceability-matrix.md |
| JetPack image build | Tag on main | Tier-2 | JetPack 6.2 image built with deployment binary preinstalled, signed, and attested |
| Operator tooling tarball | Tag on main | Tier-1 | Tarball contains C11 Tile Manager (both TileDownloader and TileUploader) + C12 Operator Pre-flight Tooling + mock-sat-service compose + verification script |

Only Tier-2 jobs are AC-bound. Apart from the native Jetson build and the JetPack image build, every other stage runs on Tier-1.

Stage Details

Lint

Linters run in parallel, one job per language, inside a single Tier-1 workflow. Within each language the report keeps per-file ordering, so a single failure can be found by grepping the log.

| Language | Tool | Rules |
| --- | --- | --- |
| Python | ruff (formatter + linter) | Project's pyproject.toml configures rules; ruff check enforces the lint rules and ruff format --check enforces that the committed code is formatted |
| Python types | mypy --strict | Strict mode; all components must type-check (CI fails on error: ...) |
| C++ | clang-format --dry-run + clang-tidy | .clang-format lives at repo root; clang-tidy checks listed in .clang-tidy |
| CMake | cmakelang (cmake-format --check) | .cmake-format.yaml lives at repo root |
| YAML / Markdown | yamllint, markdownlint-cli | Used for .github/, _docs/, docker-compose*.yml |

Unit

| Component | Framework | Coverage gate |
| --- | --- | --- |
| Python (host code) | pytest + pytest-cov | --cov-fail-under=75 per component; safety-critical (C5, C8) at --cov-fail-under=90 |
| C++ (per-strategy native builds) | gtest + lcov | Per-strategy library ≥ 75 % line coverage; klt_ransac (mandatory simple-baseline) at ≥ 90 % |
| Mock sat service (.NET) | dotnet test + coverlet | ≥ 75 % line coverage on the mock |

Coverage report is published as a pipeline artifact (coverage/index.html). CI fails fast on threshold violation.
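
The two-level threshold rule can be sketched as below. This is a minimal illustration, not the project's gate: the function names and the sample percentages are hypothetical, and only the 75 % / 90 % thresholds and the C5 / C8 safety-critical set come from this document.

```python
# Hypothetical per-component coverage gate. Only the thresholds and the
# safety-critical component IDs are taken from the pipeline description.
DEFAULT_THRESHOLD = 75.0
SAFETY_CRITICAL_THRESHOLD = 90.0
SAFETY_CRITICAL = {"C5", "C8"}


def required_coverage(component: str) -> float:
    """Return the minimum line coverage (percent) for a component."""
    if component in SAFETY_CRITICAL:
        return SAFETY_CRITICAL_THRESHOLD
    return DEFAULT_THRESHOLD


def coverage_violations(measured: dict[str, float]) -> list[str]:
    """Return the components whose measured coverage is below their threshold."""
    return sorted(
        name for name, pct in measured.items() if pct < required_coverage(name)
    )
```

A CI step built on this idea would exit non-zero whenever `coverage_violations` returns a non-empty list.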

Integration (Tier-1)

Drives the autodev e2e contract: runs docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build from e2e/ and captures e2e/results/report.csv.

Scenarios covered on Tier-1:

  • All FT (Functional Test) and IT (Integration Test) scenarios that DO NOT require Jetson hardware (per tests/traceability-matrix.md "Tier" column).
  • mock-suite-sat-service interactions including failure injection (latency, 5xx, partial responses, cache poisoning replay).
  • Cross-FC adapter behavior on SITL: ArduPilot Plane SITL runs as a sidecar container; iNav SITL runs as a sidecar container; companion's MAVLink and MSP2 paths are exercised against both.
  • D-PROJ-2 contract: post-landing upload payload assembly + signature verification against the mock.

Build (Tier-1, both binaries)

Per ADR-002, every PR produces both binaries. The build job uses two parallel matrix entries with identical Dockerfile + different BUILD_* flags:

matrix:
  build_kind:
    - { tag: deployment, args: "BUILD_VINS_MONO=OFF BUILD_SALAD=OFF" }
    - { tag: research,   args: "BUILD_VINS_MONO=ON  BUILD_SALAD=ON" }

The Dockerfile receives the args; cmake -DBUILD_VINS_MONO=$BUILD_VINS_MONO -DBUILD_SALAD=$BUILD_SALAD enforces the exclusion at the C++ build layer; setup.py / pyproject.toml reads the same env to skip importing excluded modules in the composition root validator. Both images are built; both must build green; both go through SBOM and security gates.
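
As an illustration of how one matrix entry might fan out into both Docker build arguments and CMake definitions, here is a small sketch. The helper names are hypothetical; only the flag names and ON/OFF values come from the matrix above.

```python
import shlex

# The two matrix entries, copied from the build matrix above.
MATRIX = {
    "deployment": "BUILD_VINS_MONO=OFF BUILD_SALAD=OFF",
    "research": "BUILD_VINS_MONO=ON  BUILD_SALAD=ON",
}


def parse_build_args(args: str) -> dict[str, str]:
    """Split 'NAME=ON NAME2=OFF' into a dict, rejecting anything but ON/OFF."""
    parsed = {}
    for pair in shlex.split(args):
        name, _, value = pair.partition("=")
        if value not in {"ON", "OFF"}:
            raise ValueError(f"unexpected value for {name}: {value!r}")
        parsed[name] = value
    return parsed


def docker_build_args(args: str) -> list[str]:
    """Arguments for `docker build`: one --build-arg per flag."""
    out = []
    for name, value in parse_build_args(args).items():
        out += ["--build-arg", f"{name}={value}"]
    return out


def cmake_flags(args: str) -> list[str]:
    """Matching -D definitions for the cmake invocation inside the image."""
    return [f"-D{name}={value}" for name, value in parse_build_args(args).items()]
```

Deriving both argument lists from the same string keeps the Docker layer and the C++ build layer from drifting apart.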

SBOM diff (ADR-002 enforcement)

- name: sbom-deployment
  run: syft packages docker:gps-denied/companion-tier1:deployment-${{ github.sha }} -o spdx-json > sbom-deployment.json

- name: sbom-research
  run: syft packages docker:gps-denied/companion-tier1:research-${{ github.sha }} -o spdx-json > sbom-research.json

- name: sbom-diff
  run: python ci/sbom_diff.py --deployment sbom-deployment.json --research sbom-research.json

ci/sbom_diff.py enforces:

  • vins_mono, salad, and any module flagged "research-only" in _docs/02_document/components/ MUST appear in research SBOM and MUST NOT appear in deployment SBOM.
  • The deployment SBOM is a strict subset of the research SBOM (i.e., the research binary contains everything the deployment binary contains plus the research-only modules).
  • Both SBOMs are attached as workflow artifacts and as release artifacts on tag.
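
The first two rules can be sketched as follows. This is not the contents of ci/sbom_diff.py, which this document does not show. It assumes SPDX JSON input with a top-level `packages` array whose entries carry a `name` field, and it assumes that research-only modules appear under exactly those package names, which may not hold for real syft output.

```python
def package_names(sbom: dict) -> set[str]:
    """Collect package names from an SPDX JSON document loaded as a dict."""
    return {pkg["name"] for pkg in sbom.get("packages", [])}


def sbom_violations(deployment: dict, research: dict, research_only: set[str]) -> list[str]:
    """Return human-readable rule violations; an empty list means the gate passes."""
    dep = package_names(deployment)
    res = package_names(research)
    problems = []
    for name in sorted(research_only):
        if name in dep:
            problems.append(f"{name}: present in deployment SBOM")
        if name not in res:
            problems.append(f"{name}: missing from research SBOM")
    extra = sorted(dep - res)
    if extra:
        problems.append(f"deployment-only packages: {', '.join(extra)}")
    elif dep == res:
        problems.append("deployment SBOM is not a strict subset of research SBOM")
    return problems
```

For example, a deployment SBOM that accidentally contains `vins_mono` yields one violation string, which the CI step could print before exiting non-zero.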

Security

| Check | Tool | Block on |
| --- | --- | --- |
| Python dependency CVEs | pip-audit against pyproject.toml lockfile | Critical / High severity |
| .NET dependency CVEs | dotnet list package --vulnerable --include-transitive | Critical / High severity |
| C++ dependency CVEs | Manual audit via SBOM matched against NVD; osv-scanner for known submodule pins | Critical / High severity |
| Image scan | Trivy on all CI-built images | Critical / High severity |
| OpenCV pin gate | CI step asserts the resolved OpenCV version is ≥ 4.12.0 (D-CROSS-CVE-1) | Any version < 4.12.0 |
| GTSAM CVE re-scan | Monthly scheduled workflow against the GTSAM commit pinned in cmake/dependencies.cmake | Any new published CVE |
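
The OpenCV pin gate amounts to a numeric version comparison. The sketch below assumes the resolved version is available as a string such as "4.12.0"; how the pipeline obtains that string is not specified here, and the function names are hypothetical.

```python
import re

# Minimum allowed OpenCV version, from the pin gate above (D-CROSS-CVE-1).
MIN_OPENCV = (4, 12, 0)


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract the leading major.minor.patch triple from a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch))


def opencv_pin_ok(resolved: str) -> bool:
    """True when the resolved version is at least MIN_OPENCV."""
    return parse_version(resolved) >= MIN_OPENCV
```

Comparing integer tuples rather than raw strings matters here: as strings, "4.9.0" sorts after "4.12.0", which would let an older release through the gate.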

Push images (Tier-1)

On push to dev, stage, main: tag images with ${BRANCH_NAME}-${BUILD_KIND}-${SHORT_SHA} and push to the registry. PR events do NOT push — PRs get test signal only.
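
A minimal sketch of the tagging and push rule, assuming a 7-character short SHA (the document does not state the length) and the GitHub Actions event names `push` and `pull_request`. The function names are hypothetical.

```python
# Branches whose pushes publish images, per the rule above.
PUSH_BRANCHES = {"dev", "stage", "main"}


def image_tag(branch: str, build_kind: str, sha: str, short_len: int = 7) -> str:
    """Compose ${BRANCH_NAME}-${BUILD_KIND}-${SHORT_SHA}; short_len is an assumption."""
    return f"{branch}-{build_kind}-{sha[:short_len]}"


def should_push(event: str, branch: str) -> bool:
    """Only pushes to the protected branches publish; PR events never do."""
    return event == "push" and branch in PUSH_BRANCHES
```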

Build (Tier-2 deployment binary)

A self-hosted Jetson runner (labels: [self-hosted, jetson, orin-nano-super]) builds the deployment binary natively. The build is not containerized (see architecture.md § 3 for the rationale). After the build:

  1. Compute the deployment-binary SBOM on Jetson.
  2. Compare it byte-for-byte (after canonicalization) against the Tier-1 deployment-binary SBOM. If they diverge, the job fails: the two binaries must be built from the same source and the same dependency pins.
  3. Cache the TRT engine builds on the Jetson runner's persistent cache (keyed by manifest hash) so subsequent CI runs reuse them.
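
The comparison in step 2 depends on what "canonicalization" removes, which this document does not define. The sketch below assumes it means dropping volatile SPDX metadata (`created`, `documentNamespace`), ordering list entries deterministically, and serializing with sorted keys; all of these are assumptions for illustration.

```python
import json

# Fields assumed to differ between runs without reflecting a real difference.
VOLATILE_KEYS = {"created", "documentNamespace"}


def canonicalize(node):
    """Recursively drop volatile keys and put list entries in a stable order."""
    if isinstance(node, dict):
        return {k: canonicalize(v) for k, v in node.items() if k not in VOLATILE_KEYS}
    if isinstance(node, list):
        items = [canonicalize(v) for v in node]
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    return node


def canonical_bytes(sbom: dict) -> bytes:
    """Serialize the canonical form with sorted keys and fixed separators."""
    text = json.dumps(canonicalize(sbom), sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def sboms_match(tier1: dict, tier2: dict) -> bool:
    """Byte-for-byte comparison of the two canonical serializations."""
    return canonical_bytes(tier1) == canonical_bytes(tier2)
```

With this approach two SBOMs that differ only in timestamp, key order, or package order compare equal, while a changed dependency pin does not.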

AC-bound NFTs (Tier-2)

Run only on the Tier-2 runner. Each NFT corresponds to one or more acceptance-criterion entries in tests/traceability-matrix.md. The runner:

  1. Pulls the freshly-built deployment binary.
  2. Mounts the curated tests/fixtures/flight_derkachi/ replay corpus.
  3. Runs each NFT scenario, captures jetson-stats telemetry (CPU, GPU, temp, throttle, RAM, VRAM), and compares against the AC threshold.
  4. Publishes a per-NFT report; pipeline fails if any threshold is missed.

| NFT scenario | AC | Pass criterion |
| --- | --- | --- |
| NFT-PERF-01 | AC-4.1 | E2E p95 ≤ 400 ms over 1000-frame replay (steady state) |
| NFT-PERF-02 | AC-4.4 | No frame batching detected (per-frame emit gap < 50 ms) |
| NFT-PERF-03 | AC-NEW-1 | Cold-start TTFF p95 < 30 s over 50 cold boots |
| NFT-PERF-04 | AC-NEW-2 | Spoofing-promotion latency p95 < 3 s on AP SITL + iNav SITL |
| NFT-LIM-01 | AC-4.2 | Memory < 8 GB shared (CPU + GPU) over 8 h replay |
| NFT-LIM-02 | AC-NEW-3 | FDR ring stays ≤ 64 GB; no silent drops |
| NFT-LIM-04 | AC-NEW-5 | Workstation thermal-baseline (chamber test deferred) |
| NFT-RES-03 | AC-NEW-4 | Monte Carlo: P(err > 500 m) < 0.1 %, P(err > 1 km) < 0.01 %, with stated 95 % CI |
| NFT-RES-04 | AC-NEW-8 | VISUAL_BLACKOUT mode transition ≤ 400 ms; covariance grows monotonically |
| NFT-SEC-01 | AC-NEW-7 | Cache-poisoning Monte Carlo on onboard side: P(misalign > 30 m) < 1 %, P(> 100 m) < 0.1 %, with 95 % CI |
| NFT-SEC-03 | D-C8-9 | MAVLink 2.0 signing handshake exercised; per-flight rotation logged to FDR |
| NFT-SEC-05 | architecture.md Threat Model | Network-egress-deny on production profile validated (DNS blackhole + iptables OUTPUT REJECT effective) |
| NFT-9 hot-soak | AC-NEW-5 + AC-4.1 | 8 h at +50 °C ambient (chamber if available, else throttle-injection): p95 ≤ 400 ms throughout |
| NFT-10 SBOM CVE audit | D-CROSS-CVE-1 | SBOM clean of unpatched CVEs at audit time; failed scans blocking |
| IT-12 | architecture.md ADR-001 + ADR-002 | Comparative study replays the same fixture against research-binary's all-VIO matrix; report published |
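
Several pass criteria above are p95 thresholds. The sketch below shows one way such a check could be computed. It uses the nearest-rank percentile definition, which is an assumption: this document does not say which percentile method the NFT harness uses, and the function names are hypothetical.

```python
def percentile_nearest_rank(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct % at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def passes_latency_gate(latencies_ms: list[float], threshold_ms: float = 400.0) -> bool:
    """True when p95 latency is at or below the threshold (400 ms for AC-4.1)."""
    return percentile_nearest_rank(latencies_ms, 95) <= threshold_ms
```

Under this definition a 100-frame replay with five slow frames still passes, while six slow frames push p95 over the threshold.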

JetPack image build (release-only)

Runs on a tag push to main. Produces gps-denied-jetpack-<semver>-<sha>.img (the deployable JetPack image) plus a signed checksum. The image is uploaded to the release bucket; the checksum is signed with a release key stored in the Tier-1 secret manager.

Operator tooling tarball (release-only)

Bundles operator-tooling Docker image + mock-suite-sat-service Docker image + their compose file + a verification script + the documentation under _docs/02_document/. The tarball is uploaded to the release bucket alongside the JetPack image.

Caching Strategy

| Cache | Key | Restore Keys |
| --- | --- | --- |
| Python deps (Tier-1) | pyproject.toml hash + Python version | Python version only |
| C++ build deps (Tier-1) | cmake/dependencies.cmake hash | n/a (full rebuild on change) |
| Docker layers (Tier-1) | Dockerfile hash + dep-file hashes | Dockerfile hash |
| TRT engine cache (Tier-2) | manifest hash from _docs/02_document/data_model.md § 2.4 (engine_cache_bundle_hash) | none (engine cache is per-tuple; reuse only on exact tuple match) |
| Tier-1 build artifacts | git-sha | branch name |
| Replay fixtures | tests/fixtures/flight_derkachi/ content hash | n/a |
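
As an illustration of the key / restore-key pairing for the Python dependency cache, the sketch below derives an exact key from the pyproject.toml contents plus the Python version, and a fallback prefix from the Python version alone. The `pydeps-` naming and the 16-character digest truncation are hypothetical choices.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Short SHA-256 digest of a file's contents (truncation length is arbitrary)."""
    return hashlib.sha256(data).hexdigest()[:16]


def python_deps_cache_key(pyproject_bytes: bytes, python_version: str) -> str:
    """Exact key: changes whenever pyproject.toml or the Python version changes."""
    return f"pydeps-py{python_version}-{content_digest(pyproject_bytes)}"


def python_deps_restore_key(python_version: str) -> str:
    """Fallback prefix: matches any earlier cache for the same Python version."""
    return f"pydeps-py{python_version}-"
```

Because the restore key is a prefix of the exact key, a cache miss after a dependency change can still restore the most recent cache for the same interpreter, and only the changed packages need to be reinstalled.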

Parallelization

push → [ lint || unit (parallel per component) ] (Tier-1)
       → integration (Tier-1; sequential)
       → build matrix [deployment, research] (Tier-1; parallel)
       → [ SBOM diff || security ] (Tier-1; parallel)
       → push images (Tier-1; merge events only)
       → [ Tier-2 build || Tier-1 release prep (on tag) ] (parallel)
       → AC-bound NFTs (Tier-2; on merge events; sequential per scenario, parallel where the AC allows)
       → release (on tag; sequential)

Tier-1 stages from lint through push images typically complete in ≤ 12 min; Tier-2 NFTs take 14 h depending on the replay corpus length and the active scenario set.

Notifications

| Event | Channel | Recipients |
| --- | --- | --- |
| Build failure (Tier-1) | Slack #gps-denied-ci | onboard team |
| Tier-2 NFT failure | Slack #gps-denied-ci + email | onboard team + safety reviewer |
| Security alert (CVE block) | Slack #gps-denied-ci + email | onboard team + suite security |
| SBOM diff fail (ADR-002) | Slack #gps-denied-ci + PR comment | PR author |
| Deploy success (release) | Slack #gps-denied-releases | suite-wide |
| JetPack image signature mismatch | Slack #gps-denied-ci + email + page | release engineer + safety reviewer |

Manual-trigger override

Initially, AC-bound NFTs may run on manual trigger only while the Tier-2 runner is being provisioned and the test fixtures are being authored. Until that gating is removed, the merge gate on dev excludes Tier-2; stage and main retain the full gate. The exception is documented in _docs/02_document/deployment/deployment_procedures.md § Tier-2 enablement.
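
The interim gate rule can be expressed as a small decision function. The names below, including the `tier2_enabled` switch, are hypothetical; the real mechanism is described in deployment_procedures.md, not here.

```python
def required_gates(branch: str, tier2_enabled: bool) -> set[str]:
    """Return the gate set that must be green before merging into a branch."""
    gates = {"tier1"}
    # stage and main always keep the full gate; dev adds Tier-2 only once
    # the Tier-2 runner and fixtures are in place.
    if branch in {"stage", "main"} or (branch == "dev" and tier2_enabled):
        gates.add("tier2")
    return gates
```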

Reference: Woodpecker CI two-workflow contract

The parent suite uses Woodpecker for some sibling components. If the project decides to migrate from GitHub Actions to Woodpecker, the canonical contract from .cursor/skills/deploy/templates/ci_cd_pipeline.md § Reference Implementation applies (.woodpecker/01-test.yml + .woodpecker/02-build-push.yml, multi-arch matrix). Migration is an explicit decision, NOT current state — current pipeline is GitHub Actions plus a self-hosted Jetson runner.