GPS-Denied Onboard — CI/CD Pipeline
Date: 2026-05-09 (Plan Phase 2c — initial draft). Inputs:
_docs/02_document/architecture.md § 3 (Deployment Model); ADR-002 (build-time exclusion); ADR-005 (Tier-1 / Tier-2 are first-class); ADR-007 (mock-suite-sat-service is an e2e-test fixture; reversed 2026-05-09 from the earlier "real component boundary" framing).
Pipeline Overview
The pipeline has two execution tiers (architecture.md ADR-005), reflected in two CI runner pools that share the same workflow definitions but differ in runner labels and active job set:
| Stage | Trigger | Runner | Quality Gate |
|---|---|---|---|
| Lint | Every push, every PR | Tier-1 (GitHub-hosted x86_64) | Zero lint errors (Python: ruff + mypy --strict; C++: clang-format --dry-run + clang-tidy; CMake: cmakelang) |
| Unit | Every push, every PR | Tier-1 | All unit tests pass; coverage ≥ 75 % per component, ≥ 90 % on safety-critical (C5 state estimator, C8 FC adapters) |
| Integration (Tier-1) | Every push, every PR | Tier-1 | Tier-1 integration suite passes (uses docker-compose.test.yml — companion + mock-sat + db + e2e-runner) |
| Build (Tier-1, both binaries) | Every push, every PR | Tier-1 | companion-tier1:deployment-<sha> AND companion-tier1:research-<sha> build green (ADR-002 dual-emit) |
| SBOM diff | After build | Tier-1 | Deployment SBOM excludes vins_mono, salad, etc.; research SBOM includes all strategies; PR fails on mismatch |
| Security | After build | Tier-1 | Zero unpatched critical / high CVEs (pip-audit + dotnet list package --vulnerable for mock-sat + Trivy on images) |
| Push images (Tier-1) | PR merge to dev, stage, main | Tier-1 | Push succeeds; PRs do NOT push (avoids polluting registry) |
| Build (Tier-2 deployment binary) | PR merge to dev, stage, main | Tier-2 (self-hosted Jetson) | Native build on Jetson green; deployment binary SBOM matches Tier-1 deployment SBOM |
| AC-bound NFTs (Tier-2) | PR merge to dev, stage, main; manual on PR | Tier-2 | NFT-PERF-* (AC-4.1, AC-NEW-1, AC-NEW-2), NFT-LIM-* (AC-4.2, AC-NEW-3), NFT-RES-* (AC-NEW-4, AC-NEW-7), IT-12 (comparative study) all pass thresholds in tests/traceability-matrix.md |
| JetPack image build | Tag on main | Tier-2 | JetPack 6.2 image built with deployment binary preinstalled, signed, and attested |
| Operator tooling tarball | Tag on main | Tier-1 | Tarball contains C11 Tile Manager (both TileDownloader and TileUploader) + C12 Operator Pre-flight Tooling + mock-sat-service compose + verification script |
Tier-2 jobs are the only AC-bound jobs. Everything else runs on Tier-1.
Stage Details
Lint
Lint jobs run in parallel per language inside one Tier-1 workflow. Within each language, the report keeps per-file ordering so that a single failure can be grepped in the log.
| Language | Tool | Rules |
|---|---|---|
| Python | ruff (formatter + linter) | Project's pyproject.toml configures rules; ruff format --check enforces that the committed code is formatted; ruff check enforces the lint rules |
| Python types | mypy --strict | Strict mode; all components must type-check (CI fails on error: ...) |
| C++ | clang-format --dry-run + clang-tidy | .clang-format lives at repo root; clang-tidy checks listed in .clang-tidy |
| CMake | cmakelang (cmake-format --check) | .cmake-format.yaml lives at repo root |
| YAML / Markdown | yamllint, markdownlint-cli | Used for .github/, _docs/, docker-compose*.yml |
Unit
| Component | Framework | Coverage gate |
|---|---|---|
| Python (host code) | pytest + pytest-cov | --cov-fail-under=75 per component; safety-critical (C5, C8) at --cov-fail-under=90 |
| C++ (per-strategy native builds) | gtest + lcov | Per-strategy library ≥ 75 % line coverage; klt_ransac (mandatory simple-baseline) at ≥ 90 % |
| Mock sat service (.NET) | dotnet test + coverlet | ≥ 75 % line coverage on the mock |
Coverage report is published as a pipeline artifact (coverage/index.html). CI fails fast on threshold violation.
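
The two-level coverage gate can be illustrated with a short check. This is only a sketch under stated assumptions: the `measured` mapping, the function names, and any component ID other than C5 and C8 are hypothetical and not taken from the project's CI scripts.

```python
# Safety-critical components named in the Unit table; everything else uses the default gate.
SAFETY_CRITICAL = {"C5", "C8"}
DEFAULT_GATE = 75.0
SAFETY_GATE = 90.0


def required_coverage(component: str) -> float:
    """Return the minimum line-coverage percentage for a component."""
    return SAFETY_GATE if component in SAFETY_CRITICAL else DEFAULT_GATE


def coverage_failures(measured: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Map each failing component to (measured, required); empty dict means the gate passes."""
    return {
        component: (pct, required_coverage(component))
        for component, pct in measured.items()
        if pct < required_coverage(component)
    }
```

A CI step could call `coverage_failures` on the parsed coverage report and exit non-zero when the result is non-empty, which matches the fail-fast behavior described above.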
Integration (Tier-1)
Drives the autodev e2e contract: runs docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build from e2e/ and captures e2e/results/report.csv.
Coverage scenarios on Tier-1:
- All FT (Functional Test) and IT (Integration Test) scenarios that DO NOT require Jetson hardware (per tests/traceability-matrix.md "Tier" column).
- mock-suite-sat-service interactions including failure injection (latency, 5xx, partial responses, cache poisoning replay).
- Cross-FC adapter behavior on SITL: ArduPilot Plane SITL runs as a sidecar container; iNav SITL runs as a sidecar container; companion's MAVLink and MSP2 paths are exercised against both.
- D-PROJ-2 contract: post-landing upload payload assembly + signature verification against the mock.
Build (Tier-1, both binaries)
Per ADR-002, every PR produces both binaries. The build job uses two parallel matrix entries with identical Dockerfile + different BUILD_* flags:
matrix:
  build_kind:
    - { tag: deployment, args: "BUILD_VINS_MONO=OFF BUILD_SALAD=OFF" }
    - { tag: research, args: "BUILD_VINS_MONO=ON BUILD_SALAD=ON" }
The Dockerfile receives the args; cmake -DBUILD_VINS_MONO=$BUILD_VINS_MONO -DBUILD_SALAD=$BUILD_SALAD enforces the exclusion at the C++ build layer; setup.py / pyproject.toml reads the same env to skip importing excluded modules in the composition root validator. Both images are built; both must build green; both go through SBOM and security gates.
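
To make the flag flow concrete, the sketch below expands one matrix `args` string into Docker build arguments and the matching CMake definitions. The helper names are hypothetical; only the `BUILD_*` names and the `-D...` form come from the text above, and `--build-arg` is the standard `docker build` option.

```python
def split_args(args: str) -> dict[str, str]:
    """Parse 'KEY=VALUE KEY2=VALUE2' into a dict, preserving order."""
    return dict(item.split("=", 1) for item in args.split())


def docker_build_args(args: str) -> list[str]:
    """Expand the matrix args into repeated '--build-arg KEY=VALUE' pairs."""
    out: list[str] = []
    for key, value in split_args(args).items():
        out += ["--build-arg", f"{key}={value}"]
    return out


def cmake_flags(args: str) -> list[str]:
    """Expand the same args into '-DKEY=VALUE' definitions for cmake."""
    return [f"-D{key}={value}" for key, value in split_args(args).items()]
```

Deriving both command lines from the single matrix string keeps the Docker layer and the C++ build layer from drifting apart.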
SBOM diff (ADR-002 enforcement)
- name: sbom-deployment
  run: syft packages docker:gps-denied/companion-tier1:deployment-${{ github.sha }} -o spdx-json > sbom-deployment.json
- name: sbom-research
  run: syft packages docker:gps-denied/companion-tier1:research-${{ github.sha }} -o spdx-json > sbom-research.json
- name: sbom-diff
  run: python ci/sbom_diff.py --deployment sbom-deployment.json --research sbom-research.json
ci/sbom_diff.py enforces:
- vins_mono, salad, and any module flagged "research-only" in _docs/02_document/components/ MUST appear in research SBOM and MUST NOT appear in deployment SBOM.
- The deployment SBOM is a strict subset of the research SBOM (i.e., the research binary contains everything the deployment binary contains plus the research-only modules).
- Both SBOMs are attached as workflow artifacts and as release artifacts on tag.
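
The first two rules can be sketched as set operations over package names. This is not the contents of ci/sbom_diff.py, which is not shown here; it is a minimal illustration that assumes package names are read from the top-level `packages` array of the SPDX-JSON documents and that the research-only list is supplied as a hard-coded set (a hypothetical stand-in for the flags in _docs/02_document/components/).

```python
# Hypothetical default; the real list would be derived from the component docs.
RESEARCH_ONLY = {"vins_mono", "salad"}


def package_names(sbom: dict) -> set[str]:
    """Collect package names from an SPDX-JSON document."""
    return {pkg["name"] for pkg in sbom.get("packages", [])}


def check_sboms(deployment: dict, research: dict, research_only=RESEARCH_ONLY) -> list[str]:
    """Return human-readable violations; an empty list means the gate passes."""
    dep, res = package_names(deployment), package_names(research)
    errors = []
    for name in sorted(research_only):
        if name in dep:
            errors.append(f"{name} must not appear in the deployment SBOM")
        if name not in res:
            errors.append(f"{name} must appear in the research SBOM")
    # Strict subset: research holds everything in deployment plus at least one extra package.
    if not dep < res:
        errors.append(
            f"deployment SBOM is not a strict subset of research SBOM; "
            f"missing from research: {sorted(dep - res)}"
        )
    return errors
```

Returning a list of violations rather than raising on the first one lets the PR comment report every mismatch in a single run.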
Security
| Check | Tool | Block on |
|---|---|---|
| Python dependency CVEs | pip-audit against pyproject.toml lockfile | Critical / High severity |
| .NET dependency CVEs | dotnet list package --vulnerable --include-transitive | Critical / High severity |
| C++ dependency CVEs | Manual audit via SBOM matched against NVD; osv-scanner for known submodule pins | Critical / High severity |
| Image scan | Trivy on all CI-built images | Critical / High severity |
| OpenCV pin gate | CI step asserts the resolved OpenCV version is ≥ 4.12.0 (D-CROSS-CVE-1) | Any version < 4.12.0 |
| GTSAM CVE re-scan | Monthly scheduled workflow against the GTSAM commit pinned in cmake/dependencies.cmake | Any new published CVE |
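
The OpenCV pin gate comes down to a numeric version comparison. The sketch below is an assumption about how such a step might be written (the function names are hypothetical); it compares versions as integer tuples, because a plain string comparison would wrongly rank "4.9.0" above "4.12.0".

```python
MIN_OPENCV = (4, 12, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '4.12.0' into (4, 12, 0); a non-numeric suffix on a component is ignored."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '4.12' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def opencv_pin_ok(resolved: str, minimum: tuple[int, ...] = MIN_OPENCV) -> bool:
    """True when the resolved OpenCV version satisfies the >= 4.12.0 gate."""
    return parse_version(resolved) >= minimum
```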
Push images (Tier-1)
On push to dev, stage, main: tag images with ${BRANCH_NAME}-${BUILD_KIND}-${SHORT_SHA} and push to the registry. PR events do NOT push — PRs get test signal only.
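As an illustration of the tagging scheme and the push rule, the sketch below builds the tag and decides whether an event should push. The helper names, the 7-character short SHA, and the branch sanitization are assumptions; `push` and `pull_request` are the standard GitHub Actions event names.

```python
import re

PUSH_BRANCHES = {"dev", "stage", "main"}


def image_tag(branch: str, build_kind: str, sha: str, short_len: int = 7) -> str:
    """Build '<branch>-<build_kind>-<short sha>', replacing characters not valid in a Docker tag."""
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    return f"{safe_branch}-{build_kind}-{sha[:short_len]}"


def should_push(event_name: str, ref_name: str) -> bool:
    """Only pushes to dev, stage, or main publish images; PR events never do."""
    return event_name == "push" and ref_name in PUSH_BRANCHES
```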
Build (Tier-2 deployment binary)
Self-hosted Jetson runner (labels: [self-hosted, jetson, orin-nano-super]) builds the deployment binary natively. The build is not containerized (see architecture.md § 3 for the rationale). After build:
- Compute the deployment-binary SBOM on Jetson.
- Compare it byte-for-byte (after canonicalization) against the Tier-1 deployment-binary SBOM. If they diverge, the PR fails — the two binaries must be built from the same source / same dependency pins.
- Cache the TRT engine builds on the Jetson runner's persistent cache (keyed by manifest hash) so subsequent CI runs reuse them.
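
The "byte-for-byte after canonicalization" comparison could look like the sketch below. The document does not define the canonicalization, so the choices here are assumptions: the run-specific SPDX fields `creationInfo` and `documentNamespace` are dropped, packages are sorted by name and version, and the result is serialized with sorted keys.

```python
import hashlib
import json

# Assumed to differ between runs even for identical inputs (timestamps, unique namespace).
VOLATILE_KEYS = ("creationInfo", "documentNamespace")


def canonical_bytes(sbom: dict) -> bytes:
    """Serialize an SPDX-JSON document into a stable byte string."""
    doc = {k: v for k, v in sbom.items() if k not in VOLATILE_KEYS}
    doc["packages"] = sorted(
        doc.get("packages", []),
        key=lambda p: (p.get("name", ""), p.get("versionInfo", "")),
    )
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sbom_digest(sbom: dict) -> str:
    """SHA-256 of the canonical form; equal digests mean byte-identical canonical SBOMs."""
    return hashlib.sha256(canonical_bytes(sbom)).hexdigest()
```

Comparing digests rather than raw files means the Tier-2 job only needs the Tier-1 digest as an input, not the full SBOM artifact.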
AC-bound NFTs (Tier-2)
Run only on the Tier-2 runner. Each NFT corresponds to one or more acceptance-criterion entries in tests/traceability-matrix.md. The runner:
- Pulls the freshly-built deployment binary.
- Mounts the curated tests/fixtures/flight_derkachi/ replay corpus.
- Runs each NFT scenario, captures jetson-stats telemetry (CPU, GPU, temp, throttle, RAM, VRAM), and compares against the AC threshold.
- Publishes a per-NFT report; pipeline fails if any threshold is missed.
| NFT scenario | AC | Pass criterion |
|---|---|---|
| NFT-PERF-01 | AC-4.1 | E2E p95 ≤ 400 ms over 1000-frame replay (steady state) |
| NFT-PERF-02 | AC-4.4 | No frame batching detected (per-frame emit gap < 50 ms) |
| NFT-PERF-03 | AC-NEW-1 | Cold-start TTFF p95 < 30 s over 50 cold boots |
| NFT-PERF-04 | AC-NEW-2 | Spoofing-promotion latency p95 < 3 s on AP SITL + iNav SITL |
| NFT-LIM-01 | AC-4.2 | Memory < 8 GB shared (CPU + GPU) over 8 h replay |
| NFT-LIM-02 | AC-NEW-3 | FDR ring stays ≤ 64 GB; no silent drops |
| NFT-LIM-04 | AC-NEW-5 | Workstation thermal-baseline (chamber test deferred) |
| NFT-RES-03 | AC-NEW-4 | Monte Carlo: P(err > 500 m) < 0.1 %, P(err > 1 km) < 0.01 %, with stated 95 % CI |
| NFT-RES-04 | AC-NEW-8 | VISUAL_BLACKOUT mode transition ≤ 400 ms; covariance grows monotonically |
| NFT-SEC-01 | AC-NEW-7 | Cache-poisoning Monte Carlo on onboard side: P(misalign > 30 m) < 1 %, P(> 100 m) < 0.1 %, with 95 % CI |
| NFT-SEC-03 | D-C8-9 | MAVLink 2.0 signing handshake exercised; per-flight rotation logged to FDR |
| NFT-SEC-05 | architecture.md Threat Model | Network-egress-deny on production profile validated (DNS blackhole + iptables OUTPUT REJECT effective) |
| NFT-9 hot-soak | AC-NEW-5 + AC-4.1 | 8 h at +50 °C ambient (chamber if available, else throttle-injection): p95 ≤ 400 ms throughout |
| NFT-10 SBOM CVE audit | D-CROSS-CVE-1 | SBOM clean of unpatched CVEs at audit time; failed scans blocking |
| IT-12 | architecture.md ADR-001 + ADR-002 | Comparative study replays the same fixture against research-binary's all-VIO matrix; report published |
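
Several pass criteria in the table are p95 latency bounds. As a sketch of how one of them (NFT-PERF-01, p95 ≤ 400 ms) might be evaluated, the code below uses the nearest-rank percentile definition. The percentile method and the function names are assumptions; the document does not state which definition the NFT harness uses.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct % of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def nft_perf_01_passes(latencies_ms: list[float], limit_ms: float = 400.0) -> bool:
    """AC-4.1 check: end-to-end p95 latency must not exceed the limit."""
    return percentile(latencies_ms, 95) <= limit_ms
```

Nearest-rank always returns an observed sample, which keeps the reported p95 traceable to a concrete frame in the replay.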
JetPack image build (release-only)
Runs on tag push to main. Produces gps-denied-jetpack-<semver>-<sha>.img (the deployable JetPack image) plus a signed checksum. The image is uploaded to the release bucket; the checksum is signed with a release key stored in the Tier-1 secret manager.
Operator tooling tarball (release-only)
Bundles operator-tooling Docker image + mock-suite-sat-service Docker image + their compose file + a verification script + the documentation under _docs/02_document/. The tarball is uploaded to the release bucket alongside the JetPack image.
Caching Strategy
| Cache | Key | Restore Keys |
|---|---|---|
| Python deps (Tier-1) | pyproject.toml hash + Python version | Python version only |
| C++ build deps (Tier-1) | cmake/dependencies.cmake hash | n/a (full rebuild on change) |
| Docker layers (Tier-1) | Dockerfile hash + dep-file hashes | Dockerfile hash |
| TRT engine cache (Tier-2) | manifest hash from _docs/02_document/data_model.md § 2.4 (engine_cache_bundle_hash) | none (engine cache is per-tuple; reuse only on exact tuple match) |
| Tier-1 build artifacts | git-sha | branch name |
| Replay fixtures | tests/fixtures/flight_derkachi/ content hash | n/a |
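
The key and restore-key pairs can be sketched as follows for two rows of the table. The key prefixes (`py-deps-`, `trt-engines-`), the 16-character digest length, and the function names are hypothetical; the sketch relies only on the usual cache behavior in which a restore key is matched as a prefix of a previously saved key.

```python
import hashlib


def digest(data: bytes, length: int = 16) -> str:
    """Short hex SHA-256 digest used as the variable part of a cache key."""
    return hashlib.sha256(data).hexdigest()[:length]


def python_deps_keys(pyproject: bytes, python_version: str) -> tuple[str, list[str]]:
    """Primary key = Python version + pyproject.toml hash; restore key = Python version only."""
    prefix = f"py-deps-{python_version}-"
    return prefix + digest(pyproject), [prefix]


def trt_engine_keys(engine_cache_bundle_hash: str) -> tuple[str, list[str]]:
    """Exact match only: no restore keys, so a different tuple never reuses an engine."""
    return f"trt-engines-{engine_cache_bundle_hash}", []
```

The empty restore-key list for the TRT cache is deliberate: a partially matching engine cache would be worse than a cold build, since engines are only valid for the exact tuple they were built for.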
Parallelization
push → [ lint || unit (parallel per component) ] (Tier-1)
→ integration (Tier-1; sequential)
→ build matrix [deployment, research] (Tier-1; parallel)
→ [ SBOM diff || security ] (Tier-1; parallel)
→ push images (Tier-1; merge events only)
→ [ Tier-2 build || Tier-1 release prep (on tag) ] (parallel)
→ AC-bound NFTs (Tier-2; on merge events; sequential per scenario, parallel where the AC allows)
→ release (on tag; sequential)
Tier-1 stages from lint through push images typically complete in ≤ 12 min; Tier-2 NFTs take 1–4 h depending on the replay corpus length and the active scenario set.
Notifications
| Event | Channel | Recipients |
|---|---|---|
| Build failure (Tier-1) | Slack #gps-denied-ci | onboard team |
| Tier-2 NFT failure | Slack #gps-denied-ci + email | onboard team + safety reviewer |
| Security alert (CVE block) | Slack #gps-denied-ci + email | onboard team + suite security |
| SBOM diff fail (ADR-002) | Slack #gps-denied-ci + PR comment | PR author |
| Deploy success (release) | Slack #gps-denied-releases | suite-wide |
| JetPack image signature mismatch | Slack #gps-denied-ci + email + page | release engineer + safety reviewer |
Manual-trigger override
Initially, AC-bound NFTs may run on manual trigger only while the Tier-2 runner is being provisioned and the test fixtures are being authored. Until that gating is removed, the merge gate on dev excludes Tier-2; stage and main retain the full gate. The exception is documented in _docs/02_document/deployment/deployment_procedures.md § Tier-2 enablement.
Reference: Woodpecker CI two-workflow contract
The parent suite uses Woodpecker for some sibling components. If the project decides to migrate from GitHub Actions to Woodpecker, the canonical contract from .cursor/skills/deploy/templates/ci_cd_pipeline.md § Reference Implementation applies (.woodpecker/01-test.yml + .woodpecker/02-build-push.yml, multi-arch matrix). Migration would be an explicit decision and is NOT the current state: the current pipeline is GitHub Actions plus a self-hosted Jetson runner.