azaion/gps-denied-onboard

mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 12:41:13 +00:00

Oleksandr Bezdieniezhnykh a7b3e60716

[autodev] Update Jetson test environment and satellite-provider integration

- Added `.env.test` to `.gitignore` to exclude test environment variables.
- Enhanced `docker-compose.test.jetson.yml` to include the real satellite-provider .NET service and its PostgreSQL database, replacing the mock service.
- Updated test execution policy to mandate all tests run exclusively on Jetson hardware, deprecating the previous two-tier model.
- Revised documentation in `_docs/LESSONS.md`, `_docs/02_document/tests/environment.md`, and `_docs/04_deploy/ci_cd_pipeline.md` to reflect the new testing strategy and environment setup.
- Improved `run-tests-jetson.sh` script to ensure proper environment variable handling and satellite-provider integration.

This commit aligns the testing framework with production environments, enhancing reliability and coverage.

2026-05-20 13:22:51 +03:00

GPS-Denied Onboard — CI/CD Pipeline

Date: 2026-05-09 (Plan Phase 2c — initial draft). Inputs: _docs/02_document/architecture.md § 3 (Deployment Model); ADR-002 (build-time exclusion); ADR-005 (Tier-1 / Tier-2 are first-class); ADR-007 (mock-suite-sat-service is an e2e-test fixture; reversed 2026-05-09 from the earlier "real component boundary" framing).

Test-execution policy update — 2026-05-20: all tests run on Jetson only. This Plan-phase document and ADR-005 are partially superseded — Tier-1 (workstation Docker / GitHub-hosted x86) is no longer used for ANY test stage (Lint, Unit, Integration, SBOM, Security below). Only the build/push lanes for companion-tier1 and operator-orchestrator images may continue to run on x86 agents, since those images are registry artefacts consumed downstream (operator workstations). For the operative CI contract see _docs/04_deploy/ci_cd_pipeline.md; for the test-environment policy see _docs/02_document/tests/environment.md (the source of truth on this decision).

Pipeline Overview

The pipeline has two execution tiers (architecture.md ADR-005), reflected in two CI runner pools that share the same workflow definitions but differ in runner labels and active job set:

Stage	Trigger	Runner	Quality Gate
Lint	Every push, every PR	Tier-1 (GitHub-hosted x86_64)	Zero lint errors (Python: `ruff` + `mypy --strict`; C++: `clang-format --dry-run` + `clang-tidy`; CMake: `cmakelang`)
Unit	Every push, every PR	Tier-1	All unit tests pass; coverage ≥ 75 % per component, ≥ 90 % on safety-critical (C5 state estimator, C8 FC adapters)
Integration (Tier-1)	Every push, every PR	Tier-1	Tier-1 integration suite passes (uses `docker-compose.test.yml` — companion + mock-sat + db + e2e-runner)
Build (Tier-1, both binaries)	Every push, every PR	Tier-1	`companion-tier1:deployment-<sha>` AND `companion-tier1:research-<sha>` build green (ADR-002 dual-emit)
SBOM diff	After build	Tier-1	Deployment SBOM excludes `vins_mono`, `salad`, etc.; research SBOM includes all strategies; PR fails on mismatch
Security	After build	Tier-1	Zero unpatched critical / high CVEs (`pip-audit` + `dotnet list package --vulnerable` for mock-sat + Trivy on images)
Push images (Tier-1)	PR merge to `dev`, `stage`, `main`	Tier-1	Push succeeds; PRs do NOT push (avoids polluting registry)
Build (Tier-2 deployment binary)	PR merge to `dev`, `stage`, `main`	Tier-2 (self-hosted Jetson)	Native build on Jetson green; deployment binary SBOM matches Tier-1 deployment SBOM
AC-bound NFTs (Tier-2)	PR merge to `dev`, `stage`, `main`; manual on PR	Tier-2	NFT-PERF-* (AC-4.1, AC-4.4, AC-NEW-1, AC-NEW-2), NFT-LIM-* (AC-4.2, AC-NEW-3, AC-NEW-5), NFT-RES-* (AC-NEW-4, AC-NEW-8), NFT-SEC-* (AC-NEW-7), IT-12 (comparative study) all pass thresholds in `tests/traceability-matrix.md`
JetPack image build	Tag on `main`	Tier-2	JetPack 6.2 image built with deployment binary preinstalled, signed, and attested
Operator tooling tarball	Tag on `main`	Tier-1	Tarball contains C11 Tile Manager (both `TileDownloader` and `TileUploader`) + C12 Operator Pre-flight Orchestrator + mock-sat-service compose + verification script

Only Tier-2 runs AC-bound jobs. Apart from the Tier-2 deployment build and the JetPack image build, every other stage runs on Tier-1.

Stage Details

Lint

Lint jobs run in parallel, one per language, inside a single Tier-1 workflow. Each report keeps its per-file ordering so that a single failure is easy to grep in the log.

Language	Tool	Rules
Python	`ruff` (formatter + linter)	Project's `pyproject.toml` configures rules; `ruff check` enforces the lint rules and `ruff format --check` enforces that the committed code is formatted
Python types	`mypy --strict`	Strict mode; all components must type-check (CI fails on `error: ...`)
C++	`clang-format --dry-run` + `clang-tidy`	`.clang-format` lives at repo root; `clang-tidy` checks listed in `.clang-tidy`
CMake	`cmakelang` (`cmake-format --check`)	`.cmake-format.yaml` lives at repo root
YAML / Markdown	`yamllint`, `markdownlint-cli`	Used for `.github/`, `_docs/`, `docker-compose*.yml`

Unit

Component	Framework	Coverage gate
Python (host code)	`pytest` + `pytest-cov`	`--cov-fail-under=75` per component; safety-critical (C5, C8) at `--cov-fail-under=90`
C++ (per-strategy native builds)	`gtest` + `lcov`	Per-strategy library `≥ 75 %` line coverage; `klt_ransac` (mandatory simple-baseline) at `≥ 90 %`
Mock sat service (.NET)	`dotnet test` + `coverlet`	`≥ 75 %` line coverage on the mock

The coverage report is published as a pipeline artifact (coverage/index.html). CI fails as soon as any component falls below its threshold.
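As a rough illustration of the coverage gate described above, the sketch below applies the 75 % default and the 90 % safety-critical threshold to a per-component coverage map. The function names and the input shape are hypothetical; the real gate is enforced through `--cov-fail-under` and the per-tool equivalents.

```python
# Hypothetical helper: thresholds and component IDs (C5, C8) come from the
# tables above; the dict-based input format is an assumption for illustration.
SAFETY_CRITICAL = {"C5", "C8"}


def coverage_threshold(component: str) -> float:
    """Return the minimum line coverage (in percent) for a component."""
    return 90.0 if component in SAFETY_CRITICAL else 75.0


def failing_components(coverage: dict[str, float]) -> list[str]:
    """List components whose measured coverage is below their threshold."""
    return sorted(
        name for name, pct in coverage.items()
        if pct < coverage_threshold(name)
    )
```

A CI step could exit non-zero whenever `failing_components(...)` returns a non-empty list.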

Integration (Tier-1)

Drives the autodev e2e contract: from e2e/, runs `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build` and captures e2e/results/report.csv.

Coverage scenarios on Tier-1:

- All FT (Functional Test) and IT (Integration Test) scenarios that DO NOT require Jetson hardware (per tests/traceability-matrix.md "Tier" column).
- mock-suite-sat-service interactions including failure injection (latency, 5xx, partial responses, cache poisoning replay).
- Cross-FC adapter behavior on SITL: ArduPilot Plane SITL runs as a sidecar container; iNav SITL runs as a sidecar container; companion's MAVLink and MSP2 paths are exercised against both.
- D-PROJ-2 contract: post-landing upload payload assembly + signature verification against the mock.

Build (Tier-1, both binaries)

Per ADR-002, every PR produces both binaries. The build job uses two parallel matrix entries that share one Dockerfile and differ only in their BUILD_* flags:

matrix:
  build_kind:
    - { tag: deployment, args: "BUILD_VINS_MONO=OFF BUILD_SALAD=OFF" }
    - { tag: research,   args: "BUILD_VINS_MONO=ON  BUILD_SALAD=ON" }

The Dockerfile receives the args. At the C++ build layer, cmake -DBUILD_VINS_MONO=$BUILD_VINS_MONO -DBUILD_SALAD=$BUILD_SALAD enforces the exclusion; setup.py / pyproject.toml reads the same environment variables to skip importing excluded modules in the composition root validator. Both images must build green, and both go through the SBOM and security gates.
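To make the matrix-to-build mapping concrete, here is a minimal sketch that turns one matrix entry into a `docker build` argument list. The helper name, the default image name, and the build context `.` are assumptions for illustration; only `--build-arg` and `-t` are standard Docker CLI flags.

```python
import shlex


def docker_build_command(tag_kind: str, args: str, sha: str,
                         image: str = "companion-tier1") -> list[str]:
    """Build the argv for one matrix entry (hypothetical helper).

    `args` is the space-separated KEY=VALUE string from the matrix,
    e.g. "BUILD_VINS_MONO=OFF BUILD_SALAD=OFF".
    """
    cmd = ["docker", "build", "-t", f"{image}:{tag_kind}-{sha}"]
    for pair in shlex.split(args):  # tolerates repeated spaces
        cmd += ["--build-arg", pair]
    cmd.append(".")  # build context; assumed to be the repo root
    return cmd
```

Passing the result to `subprocess.run(cmd, check=True)` would run the build and raise on a non-zero exit.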

SBOM diff (ADR-002 enforcement)

- name: sbom-deployment
  run: syft packages docker:gps-denied/companion-tier1:deployment-${{ github.sha }} -o spdx-json > sbom-deployment.json

- name: sbom-research
  run: syft packages docker:gps-denied/companion-tier1:research-${{ github.sha }} -o spdx-json > sbom-research.json

- name: sbom-diff
  run: python ci/sbom_diff.py --deployment sbom-deployment.json --research sbom-research.json

ci/sbom_diff.py enforces:

- vins_mono, salad, and any module flagged "research-only" in _docs/02_document/components/ MUST appear in research SBOM and MUST NOT appear in deployment SBOM.
- The deployment SBOM is a strict subset of the research SBOM (i.e., the research binary contains everything the deployment binary contains plus the research-only modules).
- Both SBOMs are attached as workflow artifacts and as release artifacts on tag.
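The first two rules above can be sketched as set operations over SPDX package names. This is not the actual `ci/sbom_diff.py`; the function names are hypothetical, and it assumes research-only modules show up as entries in the SPDX `packages` array under exactly these names.

```python
# Assumption: the research-only list would really be derived from
# _docs/02_document/components/; it is hard-coded here for illustration.
RESEARCH_ONLY = frozenset({"vins_mono", "salad"})


def package_names(sbom: dict) -> set[str]:
    """Collect package names from an SPDX JSON document."""
    return {pkg["name"] for pkg in sbom.get("packages", [])}


def sbom_violations(deployment: dict, research: dict,
                    research_only: frozenset = RESEARCH_ONLY) -> list[str]:
    """Return human-readable rule violations; an empty list means pass."""
    dep, res = package_names(deployment), package_names(research)
    problems = []
    for name in sorted(research_only & dep):
        problems.append(f"{name}: research-only module present in deployment SBOM")
    for name in sorted(research_only - res):
        problems.append(f"{name}: research-only module missing from research SBOM")
    for name in sorted(dep - res):
        problems.append(f"{name}: in deployment SBOM but not in research SBOM")
    if not problems and dep == res:
        problems.append("deployment SBOM is not a strict subset of research SBOM")
    return problems
```

Returning every violation rather than stopping at the first keeps the PR comment actionable when several modules leak at once.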

Security

Check	Tool	Block on
Python dependency CVEs	`pip-audit` against `pyproject.toml` lockfile	Critical / High severity
.NET dependency CVEs	`dotnet list package --vulnerable --include-transitive`	Critical / High severity
C++ dependency CVEs	Manual audit via SBOM matched against NVD; `osv-scanner` for known submodule pins	Critical / High severity
Image scan	Trivy on all CI-built images	Critical / High severity
OpenCV pin gate	CI step asserts the resolved OpenCV version is within the cycle-1 relaxed band `>=4.11.0.86,<4.12` (D-CROSS-CVE-1 — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`; original target `>=4.12.0` replays once gtsam ships numpy-2 wheels)	Any version `< 4.11.0.86` OR `>= 4.12` while leftover is open
GTSAM CVE re-scan	Monthly scheduled workflow against the GTSAM commit pinned in `cmake/dependencies.cmake`	Any new published CVE
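The OpenCV pin gate in the table above amounts to a half-open version-range check. A minimal sketch, assuming purely numeric dotted versions (no pre-release or local suffixes) and a hypothetical function name:

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '4.11.0.86'."""
    return tuple(int(part) for part in text.split("."))


# Cycle-1 relaxed band from the table: >=4.11.0.86,<4.12
LOWER = parse_version("4.11.0.86")
UPPER = parse_version("4.12")


def opencv_pin_ok(resolved: str) -> bool:
    """True when the resolved OpenCV version lies inside the relaxed band."""
    return LOWER <= parse_version(resolved) < UPPER
```

Tuple comparison treats `4.12.0.88` as greater than `4.12`, so any 4.12.x release falls outside the band, as the gate requires while the leftover is open.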

Push images (Tier-1)

On push to dev, stage, main: tag images with ${BRANCH_NAME}-${BUILD_KIND}-${SHORT_SHA} and push to the registry. PR events do NOT push — PRs get test signal only.
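The tagging rule can be sketched as follows. The function names and the event strings are hypothetical, and the 7-character short SHA is an assumption; the source does not state how `SHORT_SHA` is derived.

```python
PUSH_BRANCHES = {"dev", "stage", "main"}


def image_tag(branch: str, build_kind: str, sha: str, short_len: int = 7) -> str:
    """Compose ${BRANCH_NAME}-${BUILD_KIND}-${SHORT_SHA} (short length assumed)."""
    return f"{branch}-{build_kind}-{sha[:short_len]}"


def should_push(event: str, branch: str) -> bool:
    """Only push events on the protected branches publish images; PRs never do."""
    return event == "push" and branch in PUSH_BRANCHES
```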

Build (Tier-2 deployment binary)

Self-hosted Jetson runner (labels: [self-hosted, jetson, orin-nano-super]) builds the deployment binary natively. The build is not containerized (see architecture.md § 3 for the rationale). After build:

- Compute the deployment-binary SBOM on Jetson.
- Compare it byte-for-byte (after canonicalization) against the Tier-1 deployment-binary SBOM. If they diverge, the pipeline fails: the two binaries must be built from the same source and the same dependency pins.
- Cache the TRT engine builds on the Jetson runner's persistent cache (keyed by manifest hash) so subsequent CI runs reuse them.
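One possible reading of "byte-for-byte after canonicalization" is sketched below: drop fields that legitimately differ between runs, sort the package list, and serialize with sorted keys. The set of volatile keys is an assumption; the source does not define the canonical form.

```python
import json

# Assumption: these top-level SPDX fields vary per run (timestamps, unique
# namespace) and are excluded before comparison.
VOLATILE_KEYS = {"creationInfo", "documentNamespace"}


def canonicalize(sbom: dict) -> bytes:
    """Serialize an SBOM deterministically so equal content yields equal bytes."""
    stable = {k: v for k, v in sbom.items() if k not in VOLATILE_KEYS}
    if isinstance(stable.get("packages"), list):
        stable["packages"] = sorted(
            stable["packages"], key=lambda pkg: json.dumps(pkg, sort_keys=True)
        )
    return json.dumps(stable, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sboms_match(tier1: dict, tier2: dict) -> bool:
    """Compare the Tier-1 and Tier-2 deployment SBOMs after canonicalization."""
    return canonicalize(tier1) == canonicalize(tier2)
```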

AC-bound NFTs (Tier-2)

Run only on the Tier-2 runner. Each NFT corresponds to one or more acceptance-criterion entries in tests/traceability-matrix.md. The runner:

- Pulls the freshly-built deployment binary.
- Mounts the curated tests/fixtures/flight_derkachi/ replay corpus.
- Runs each NFT scenario, captures jetson-stats telemetry (CPU, GPU, temp, throttle, RAM, VRAM), and compares against the AC threshold.
- Publishes a per-NFT report; pipeline fails if any threshold is missed.

NFT scenario	AC	Pass criterion
NFT-PERF-01	AC-4.1	E2E p95 ≤ 400 ms over 1000-frame replay (steady state)
NFT-PERF-02	AC-4.4	No frame batching detected (per-frame emit gap < 50 ms)
NFT-PERF-03	AC-NEW-1	Cold-start TTFF p95 < 30 s over 50 cold boots
NFT-PERF-04	AC-NEW-2	Spoofing-promotion latency p95 < 3 s on AP SITL + iNav SITL
NFT-LIM-01	AC-4.2	Memory < 8 GB shared (CPU + GPU) over 8 h replay
NFT-LIM-02	AC-NEW-3	FDR ring stays ≤ 64 GB; no silent drops
NFT-LIM-04	AC-NEW-5	Workstation thermal-baseline (chamber test deferred)
NFT-RES-03	AC-NEW-4	Monte Carlo: P(err > 500 m) < 0.1 %, P(err > 1 km) < 0.01 %, with stated 95 % CI
NFT-RES-04	AC-NEW-8	VISUAL_BLACKOUT mode transition ≤ 400 ms; covariance grows monotonically
NFT-SEC-01	AC-NEW-7	Cache-poisoning Monte Carlo on onboard side: P(misalign > 30 m) < 1 %, P(> 100 m) < 0.1 %, with 95 % CI
NFT-SEC-03	D-C8-9	MAVLink 2.0 signing handshake exercised; per-flight rotation logged to FDR
NFT-SEC-05	architecture.md Threat Model	Network-egress-deny on production profile validated (DNS blackhole + iptables OUTPUT REJECT effective)
NFT-9 hot-soak	AC-NEW-5 + AC-4.1	8 h at +50 °C ambient (chamber if available, else throttle-injection): p95 ≤ 400 ms throughout
NFT-10 SBOM CVE audit	D-CROSS-CVE-1	SBOM clean of unpatched CVEs at audit time; a failed scan blocks the pipeline
IT-12	architecture.md ADR-001 + ADR-002	Comparative study replays the same fixture against research-binary's all-VIO matrix; report published
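Several pass criteria in the table are p95 bounds. The sketch below shows one way to evaluate the NFT-PERF-01 criterion using the nearest-rank percentile; the choice of nearest-rank and the function names are assumptions, since the source does not specify the percentile method.

```python
def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile: smallest value with >= 95 % of samples at or below it."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def nft_perf_01_passes(latencies_ms: list[float], limit_ms: float = 400.0) -> bool:
    """AC-4.1 check: end-to-end p95 latency must not exceed the limit."""
    return p95(latencies_ms) <= limit_ms
```

Integer arithmetic for the rank avoids floating-point rounding surprises at sample counts such as 1000.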

JetPack image build (release-only)

Runs on tag push to main. Produces gps-denied-jetpack-<semver>-<sha>.img (the deployable JetPack image) plus a signed checksum. The image is uploaded to the release bucket; the checksum is signed with a release key stored in the Tier-1 secret manager.
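The checksum half of this step can be sketched as below. SHA-256 and the `sha256sum`-style output line are assumptions, as the source names neither the hash algorithm nor the signing tool; signing itself is left out because it depends on the release key setup.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-gigabyte image never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_line(path: str) -> str:
    """Format '<hex digest>  <file name>', the layout sha256sum emits."""
    return f"{sha256_of_file(path)}  {os.path.basename(path)}"
```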

Operator tooling tarball (release-only)

Bundles operator-orchestrator Docker image + mock-suite-sat-service Docker image + their compose file + a verification script + the documentation under _docs/02_document/. The tarball is uploaded to the release bucket alongside the JetPack image.

Caching Strategy

Cache	Key	Restore Keys
Python deps (Tier-1)	`pyproject.toml` hash + Python version	Python version only
C++ build deps (Tier-1)	`cmake/dependencies.cmake` hash	n/a — full rebuild on change
Docker layers (Tier-1)	`Dockerfile` hash + dep-file hashes	Dockerfile hash
TRT engine cache (Tier-2)	manifest hash from `_docs/02_document/data_model.md` § 2.4 (`engine_cache_bundle_hash`)	none (engine cache is per-tuple; reuse only on exact tuple match)
Tier-1 build artifacts	`git-sha`	branch name
Replay fixtures	`tests/fixtures/flight_derkachi/` content hash	n/a
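As an illustration of the first row of the table, the sketch below derives a primary cache key from the `pyproject.toml` content hash plus the Python version, with a restore prefix that matches on Python version only. The key layout, prefix, and 16-character digest truncation are hypothetical.

```python
import hashlib


def python_deps_cache_key(pyproject_bytes: bytes,
                          python_version: str) -> tuple[str, list[str]]:
    """Return (primary key, restore-key prefixes) for the Python deps cache."""
    digest = hashlib.sha256(pyproject_bytes).hexdigest()[:16]
    prefix = f"pydeps-py{python_version}-"
    return prefix + digest, [prefix]
```

An exact match on the primary key restores a fully valid cache; a prefix match restores the most recent cache for the same interpreter, which the installer then updates.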

Parallelization

push → [ lint || unit (parallel per component) ] (Tier-1)
       → integration (Tier-1; sequential)
       → build matrix [deployment, research] (Tier-1; parallel)
       → [ SBOM diff || security ] (Tier-1; parallel)
       → push images (Tier-1; merge events only)
       → [ Tier-2 build || Tier-1 release prep (on tag) ] (parallel)
       → AC-bound NFTs (Tier-2; on merge events; sequential per scenario, parallel where the AC allows)
       → release (on tag; sequential)

Tier-1 stages from lint through push images typically complete in ≤ 12 min; Tier-2 NFTs take 1–4 h depending on the replay corpus length and the active scenario set.

Notifications

Event	Channel	Recipients
Build failure (Tier-1)	Slack `#gps-denied-ci`	onboard team
Tier-2 NFT failure	Slack `#gps-denied-ci` + email	onboard team + safety reviewer
Security alert (CVE block)	Slack `#gps-denied-ci` + email	onboard team + suite security
SBOM diff fail (ADR-002)	Slack `#gps-denied-ci` + PR comment	PR author
Deploy success (release)	Slack `#gps-denied-releases`	suite-wide
JetPack image signature mismatch	Slack `#gps-denied-ci` + email + page	release engineer + safety reviewer

Manual-trigger override

Initially, AC-bound NFTs may run on manual trigger only while the Tier-2 runner is being provisioned and the test fixtures are being authored. Until that gating is removed, the merge gate on dev excludes Tier-2; stage and main retain the full gate. The exception is documented in _docs/02_document/deployment/deployment_procedures.md § Tier-2 enablement.
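The branch-dependent gate described above can be written down as a small decision function. The names and the boolean flag are hypothetical; the sketch only restates the rule that dev drops the Tier-2 gate while the override is active.

```python
def required_gates(branch: str, tier2_manual_only: bool) -> list[str]:
    """Return the gates a merge to `branch` must pass (hypothetical helper).

    While Tier-2 NFTs are manual-trigger only, dev merges are gated on
    Tier-1 alone; stage and main always keep the full gate.
    """
    gates = ["tier1"]
    if branch in ("stage", "main") or not tier2_manual_only:
        gates.append("tier2")
    return gates
```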

Reference: Woodpecker CI two-workflow contract

The parent suite uses Woodpecker for some sibling components. If the project decides to migrate from GitHub Actions to Woodpecker, the canonical contract from .cursor/skills/deploy/templates/ci_cd_pipeline.md § Reference Implementation applies (.woodpecker/01-test.yml + .woodpecker/02-build-push.yml, multi-arch matrix). Migration is an explicit decision, NOT current state — current pipeline is GitHub Actions plus a self-hosted Jetson runner.
