mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 16:41:13 +00:00
[autodev] Update Jetson test environment and satellite-provider integration
ci/woodpecker/push/02-build-push Pipeline failed
- Added `.env.test` to `.gitignore` to exclude test environment variables.
- Enhanced `docker-compose.test.jetson.yml` to include the real satellite-provider .NET service and its PostgreSQL database, replacing the mock service.
- Updated test execution policy to mandate all tests run exclusively on Jetson hardware, deprecating the previous two-tier model.
- Revised documentation in `_docs/LESSONS.md`, `_docs/02_document/tests/environment.md`, and `_docs/04_deploy/ci_cd_pipeline.md` to reflect the new testing strategy and environment setup.
- Improved `run-tests-jetson.sh` script to ensure proper environment variable handling and satellite-provider integration.

This commit aligns the testing framework with production environments, enhancing reliability and coverage.
@@ -3,6 +3,18 @@
> Date: 2026-05-09 (Plan Phase 2c — initial draft).
> Inputs: `_docs/02_document/architecture.md` § 3 (Deployment Model); ADR-002 (build-time exclusion); ADR-005 (Tier-1 / Tier-2 are first-class); ADR-007 (`mock-suite-sat-service` is an e2e-test fixture; reversed 2026-05-09 from the earlier "real component boundary" framing).

> **Test-execution policy update — 2026-05-20**: **all tests run on
> Jetson only.** This Plan-phase document and ADR-005 are partially
> superseded — Tier-1 (workstation Docker / GitHub-hosted x86) is no
> longer used for ANY test stage (Lint, Unit, Integration, SBOM, Security
> below). Only the build/push lanes for `companion-tier1` and
> `operator-orchestrator` images may continue to run on x86 agents,
> since those images are registry artefacts consumed downstream (operator
> workstations). For the operative CI contract see
> `_docs/04_deploy/ci_cd_pipeline.md`; for the test-environment policy
> see `_docs/02_document/tests/environment.md` (the source of truth on
> this decision).

## Pipeline Overview

The pipeline has **two execution tiers** (architecture.md ADR-005), reflected in two CI runner pools that share the same workflow definitions but differ in runner labels and active job set:

@@ -1,5 +1,18 @@
# Test Environment

> **Active policy — 2026-05-20**: **all tests run on Jetson only.** The Jetson
> Orin Nano Super (or a Jetson-equivalent arm64 agent) is the single canonical
> test environment for every tier of testing — unit, integration, blackbox /
> e2e, performance, resilience, security, resource-limit. Workstation x86
> Docker (the historical "Tier-1" path) is **deprecated** and is not a
> supported test environment going forward; the Tier-1 sections below are
> retained as historical reference / traceability only. CI test pipelines
> target the colocated arm64 Jetson Woodpecker agent (see
> `_docs/04_deploy/ci_cd_pipeline.md`); local-development test runs SHOULD
> use `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH
> alias rather than `scripts/run-tests.sh`. This decision supersedes the
> 2026-05-09 "both" decision recorded in the § Test Execution section.

## Overview

**System under test (SUT)**: `gps-denied-onboard` companion-PC service that produces WGS84 position estimates from nav-camera frames + FC IMU/attitude and emits them to the FC over its native external-positioning interface. Public boundaries (the only surfaces tests interact with):
@@ -15,14 +28,19 @@

## Two-tier execution profile

This project requires two distinct test environments because the production target is Jetson hardware and AC-4.1/AC-4.2/AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation.
> **SUPERSEDED — 2026-05-20**: the two-tier model below is retained for
> historical traceability. The active policy is **Jetson-only** (see banner
> at the top of this doc). Tier-1 (workstation Docker) is deprecated; only
> the Tier-2 row continues to describe a supported environment.

This project originally specified two distinct test environments because the production target is Jetson hardware and AC-4.1/AC-4.2/AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation.

| Tier | Hardware | What it covers | What it skips |
|------|----------|----------------|---------------|
| **Tier-1 (workstation Docker)** | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 |
| **Tier-2 (Jetson hardware loop)** | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | AC-4.1 latency p95, AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Iteration speed (manual hardware time) |
| **Tier-1 (workstation Docker)** *(deprecated 2026-05-20)* | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 |
| **Jetson (canonical, 2026-05-20)** *(formerly "Tier-2")* | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | Everything: `FT-*` correctness, schema, `NFT-RES-*`, `NFT-SEC-*`, `NFT-LIM-*`, `NFT-PERF-*` (AC-4.1 latency p95), AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Nothing — anything that doesn't run here doesn't run at all |

CI runs Tier-1 on every PR. Tier-2 runs on hardware-attached runners on a nightly cadence and pre-release gate; results are imported into the same CSV report format as Tier-1.
CI runs the Jetson pipeline (`01-test.yml`) on the colocated arm64 Jetson agent. Chamber-only AC-NEW-5 runs on `self-hosted-jetson-orin-chamber` on the documented quarterly + pre-release cadence; results are recorded in the same CSV report format.

## Docker Environment (Tier-1)

@@ -213,20 +231,19 @@ The captured-fixture builder framework (`e2e/fixtures/sitl_replay_builder/`) reg

## CI/CD Integration

**When to run**:
- Tier-1 (workstation Docker): on every PR to `dev` branch and nightly on `dev` HEAD.
- Tier-2 (Jetson hardware loop): nightly on `dev`, and as a hard gate before any release tag.
- AC-NEW-5 thermal envelope: monthly on chamber-attached Jetson runner; failures block release tags only.
> **2026-05-20**: rewritten for the Jetson-only policy. Tier-1 references in the historical sub-sections below are no longer operative.

**Pipeline stage**:
- Tier-1 fits in the standard CI matrix as a single job (~30-45 min wall-clock for the full suite at first cut).
- Tier-2 is a separate workflow on `self-hosted-jetson-orin` runner.
**When to run** (active policy):

**Gate behavior**: Tier-1 blocks PR merge on any test failure. Tier-2 blocks release tag on any test failure. Chamber tests are warning-only on PRs and blocking on release tags.
- Jetson (colocated arm64 Woodpecker agent): on every PR to `dev` branch, nightly on `dev` HEAD, and as a hard gate before any release tag.
- AC-NEW-5 thermal envelope: quarterly on the chamber-attached Jetson runner; failures block release tags only.

**Pipeline stage**: a single Jetson workflow (`.woodpecker/01-test.yml`) on the `self-hosted-jetson-orin` runner exercises the full suite — there is no longer a parallel x86 lane.

**Gate behavior**: Jetson blocks PR merge on any test failure and blocks release tags on any test failure. Chamber tests are warning-only on PRs and blocking on release tags.
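
The active gate rules above amount to a small decision table. The sketch below is illustrative only: the function name `gate_action` and the labels `jetson`, `chamber`, `pr`, and `release` are hypothetical and are not taken from the pipeline definition. It encodes only the stated behavior: a Jetson failure blocks both PR merges and release tags, while a chamber failure warns on PRs and blocks release tags.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map (suite, event, result) to a gate action.
# All names here are illustrative; the real gating lives in the CI config.
gate_action() {
  local suite="$1" event="$2" result="$3"
  if [ "$result" = "pass" ]; then
    echo "allow"
    return 0
  fi
  case "$suite:$event" in
    jetson:pr|jetson:release) echo "block" ;;  # any Jetson failure blocks
    chamber:pr)               echo "warn"  ;;  # chamber is warning-only on PRs
    chamber:release)          echo "block" ;;  # chamber blocks release tags
    *)
      echo "unknown suite/event: $suite/$event" >&2
      return 2
      ;;
  esac
}

gate_action jetson pr fail        # prints "block"
gate_action chamber pr fail       # prints "warn"
gate_action chamber release fail  # prints "block"
```
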

**Timeout**:
- Tier-1: 60 min per matrix entry.
- Tier-2: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops).
- Jetson: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops).
- Thermal chamber AC-NEW-5: 9 hr (8 h hot-soak + setup/teardown).

## Reporting
@@ -246,7 +263,17 @@ The captured-fixture builder framework (`e2e/fixtures/sitl_replay_builder/`) reg

## Test Execution

**Decision (2026-05-09)**: **both** — Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate.
**Decision (2026-05-20)** — **Jetson only.** Supersedes the 2026-05-09 "both" decision below. All tests (unit, integration, blackbox / e2e, performance, resilience, security, resource-limit) run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). The workstation x86 Docker path is deprecated. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entry): repeated workstation-vs-Jetson environment divergences (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration) were producing false-negative test runs and consuming engineering time without ever exercising the production-equivalent hardware path.

**Operational entry points**:
- Local-development: `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH alias (see `_docs/03_implementation/jetson_harness_setup.md` for one-time setup).
- CI: `.woodpecker/01-test.yml` on the colocated arm64 Jetson agent (see `_docs/04_deploy/ci_cd_pipeline.md`).

The remainder of this section preserves the original 2026-05-09 decision context for traceability.

---

**Decision (2026-05-09, SUPERSEDED)**: **both** — Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate.

### Hardware dependencies found (Phase 3 → Hardware Assessment scan)

@@ -340,8 +367,13 @@ When invoked on a control host (typical), the script SSH-orchestrates the Jetson

### CI runner mapping

- `ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry.
- `self-hosted-jetson-orin` → Tier-2 Jetson, nightly on `dev` HEAD + pre-release gate. ~4 hr per matrix entry.
**Active mapping (2026-05-20)**:

- `self-hosted-jetson-orin` (colocated arm64 Woodpecker agent) → all test runs, every PR + nightly + pre-release. ~4 hr per matrix entry. **This is the single canonical CI test runner.**
- `self-hosted-jetson-orin-chamber` → AC-NEW-5 hot-soak. Quarterly + before any release tag. ~9 hr.

**Removed (2026-05-20)**:

- ~~`ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry.~~ — Tier-1 workstation Docker is deprecated; no x86 CI agent participates in the test path. CI build-push lanes that ship images may still run on amd64 if/when that matrix dimension is uncommented in `02-build-push.yml`, but the test lane is Jetson-only.

**Matrix dimensions**: `FC_ADAPTER × VIO_STRATEGY × build_kind` where `build_kind ∈ {production, research}`. Production `vins_mono` is excluded (D-C1-1-SUB-A locked); research includes all three VioStrategy values.

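
To make the matrix rule above concrete, here is a minimal enumeration sketch. Only `vins_mono`, `production`, and `research` are named in the text; the adapter names (`fc_a`, `fc_b`) and the other two strategy names (`vio_b`, `vio_c`) are placeholders invented for illustration, so the resulting count applies only to these stand-in values.

```bash
#!/usr/bin/env bash
# Placeholder dimension values. Only vins_mono / production / research come
# from the document; fc_a, fc_b, vio_b and vio_c are hypothetical stand-ins.
FC_ADAPTERS="fc_a fc_b"
VIO_STRATEGIES="vins_mono vio_b vio_c"
BUILD_KINDS="production research"

matrix_entries() {
  local fc vio kind
  for fc in $FC_ADAPTERS; do
    for vio in $VIO_STRATEGIES; do
      for kind in $BUILD_KINDS; do
        # Production builds exclude vins_mono; research keeps every strategy.
        if [ "$kind" = "production" ] && [ "$vio" = "vins_mono" ]; then
          continue
        fi
        echo "$fc $vio $kind"
      done
    done
  done
}

matrix_entries
```

With these placeholder values the full product has 2 × 3 × 2 = 12 combinations, and the exclusion removes the two `vins_mono production` entries, leaving 10.
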
@@ -137,6 +137,36 @@ Need ≥ 30 GB free on `/var/lib/docker`. Swap should be at least 4 GB

## Running the harness

### Pre-flight (one-time, then on JWT secret rotation)

AZ-688 added the real `../satellite-provider` .NET service to the Jetson
compose graph. Two extra setup steps before the first run:

```bash
# 1. Sibling repo must be checked out alongside gps-denied-onboard/.
# The harness rsyncs both repos to the Jetson; the relative `../satellite-provider`
# path in docker-compose.test.jetson.yml resolves identically on Mac and Jetson.
ls ../satellite-provider/SatelliteProvider.sln # sanity check

# 2. Copy the env template and fill in the dev JWT secret. .env.test is
# gitignored; the script refuses to start if it's missing or if any
# of JWT_SECRET / JWT_ISSUER / JWT_AUDIENCE are unset.
cp .env.test.example .env.test
# Generate a fresh dev secret (≥32 bytes for HMAC-SHA256):
openssl rand -hex 32
# Paste into JWT_SECRET=… in .env.test. The same secret is later used by
# AZ-690 (dev JWT minting helper) to sign tokens that this same provider
# validates. Issuer/audience defaults are pre-filled.
```

The dev TLS cert (`../satellite-provider/certs/{api.pfx,api.crt,api.key}`)
is regenerated on demand by `scripts/ensure-dev-cert.sh`, which
`run-tests-jetson.sh` calls automatically. The cert is self-signed,
gitignored in both repos, and pinned to SAN `api`/`satellite-provider`/
`localhost`/`127.0.0.1` — see the script for the openssl recipe.

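
The fail-fast `.env.test` check described in pre-flight step 2 could look roughly like the sketch below. This is not the code in `run-tests-jetson.sh`: the function name `check_env_file` is hypothetical, and the sketch assumes `.env.test` uses plain shell-compatible `KEY=value` lines. The 32-byte minimum is approximated by string length, which matches byte count for ASCII secrets such as the `openssl rand -hex 32` output.

```bash
#!/usr/bin/env bash
# Hypothetical validation helper (illustrative, not the harness's own code).
# Returns non-zero if the env file is missing, a JWT variable is unset,
# or JWT_SECRET is shorter than 32 characters.
check_env_file() {
  local env_file="${1:-.env.test}"
  # A bare file name would make `.` search PATH first; anchor it to the cwd.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  if [ ! -f "$env_file" ]; then
    echo "error: $env_file not found (copy .env.test.example first)" >&2
    return 1
  fi
  # Validate in a subshell so nothing leaks into the caller's environment.
  (
    # Ignore inherited values so the check reflects the file alone.
    unset JWT_SECRET JWT_ISSUER JWT_AUDIENCE
    . "$env_file"
    for name in JWT_SECRET JWT_ISSUER JWT_AUDIENCE; do
      if [ -z "${!name:-}" ]; then
        echo "error: $name is unset in $env_file" >&2
        exit 1
      fi
    done
    if [ "${#JWT_SECRET}" -lt 32 ]; then
      echo "error: JWT_SECRET is shorter than 32 bytes" >&2
      exit 1
    fi
  )
}
```

A caller would typically run `check_env_file .env.test || exit 1` before doing any rsync or SSH work, so a misconfigured environment stops the run before anything is copied to the Jetson.
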
### Run

From the developer Mac, repo root:

```bash
@@ -145,11 +175,18 @@ bash scripts/run-tests-jetson.sh

What happens:

1. `rsync` source → `jetson-e2e:~/gps-denied-onboard/` (excludes `.git`,
1. Load `.env.test` (fail-fast if missing / JWT vars unset / `JWT_SECRET` < 32 bytes).
2. `scripts/ensure-dev-cert.sh` on the Mac — idempotent dev TLS cert generation
   into `../satellite-provider/certs/`.
3. `rsync` source → `jetson-e2e:~/gps-denied-onboard/` (excludes `.git`,
   `__pycache__`, build artefacts; LFS pointers transfer as text).
2. `ssh jetson-e2e docker compose -f docker-compose.test.jetson.yml build e2e-runner`
3. `ssh jetson-e2e docker compose ... up --abort-on-container-exit --exit-code-from e2e-runner`
4. stdout / stderr stream to the Mac terminal; exit code propagates.
4. `rsync` `../satellite-provider/` → `jetson-e2e:~/satellite-provider/`
   (sibling of `gps-denied-onboard/` so the compose path resolves).
5. `ssh jetson-e2e docker compose ... build e2e-runner satellite-provider`
   (env vars exported through the heredoc so the upstream compose's
   `${JWT_SECRET}` interpolation resolves on the Jetson side).
6. `ssh jetson-e2e docker compose ... up --abort-on-container-exit --exit-code-from e2e-runner`.
7. stdout / stderr stream to the Mac terminal; exit code propagates.

Override the alias or remote dir if your setup differs:

@@ -158,6 +195,11 @@ JETSON_SSH_ALIAS=other-host JETSON_REMOTE_DIR=~/somewhere/else \
bash scripts/run-tests-jetson.sh
```

`JETSON_REMOTE_DIR` MUST be a path whose parent directory is writable —
the harness places `satellite-provider/` next to it. With the default
`~/gps-denied-onboard`, the satellite-provider lands at
`~/satellite-provider/` on the Jetson.

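
The sibling-directory rule above can be sketched as a one-line path derivation. The helper name `satellite_provider_remote_dir` is hypothetical and is not taken from the harness. The path is handled as a plain string, so a leading `~` is left for the remote shell to expand.

```bash
#!/usr/bin/env bash
# Hypothetical helper: given JETSON_REMOTE_DIR, print where the sibling
# satellite-provider/ checkout would land (same parent directory).
satellite_provider_remote_dir() {
  local remote_dir="$1"
  # dirname yields the parent directory; a quoted "~" is not expanded locally.
  printf '%s/satellite-provider\n' "$(dirname "$remote_dir")"
}

satellite_provider_remote_dir '~/gps-denied-onboard'  # prints ~/satellite-provider
satellite_provider_remote_dir '~/somewhere/else'      # prints ~/somewhere/satellite-provider
```

This is why the parent of `JETSON_REMOTE_DIR` has to be writable: the second checkout is created beside the first rather than inside it.
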
## Smoke vs. Reality Gate split — at a glance

| Test category | Marker | Colima (Tier-1) | Jetson (Tier-2) |
@@ -190,7 +232,14 @@ JETSON_SSH_ALIAS=other-host JETSON_REMOTE_DIR=~/somewhere/else \
## Related Jira

* AZ-615 — this harness (Jetson runner story)
* AZ-616 — replace `mock-sat` with real `../satellite-provider` service
* AZ-616 — umbrella: replace `mock-sat` with real `../satellite-provider` service
* AZ-688 — Compose-include real satellite-provider + Postgres (this doc)
* AZ-689 — Seed Derkachi-bbox fixture tile set for hermetic e2e
* AZ-690 — Long-lived dev JWT minting helper
* AZ-691 — Python `SatelliteProviderClient`
* AZ-692 — Wire client into composition root; retire `mock-sat`
* AZ-693 — Docs: client contract + test env + containerization
* AZ-694 — AC-8 unskip + diagnose (sibling Story, not a subtask)
* AZ-617 — mark heavy ACs with `tier2` (already applied; this story
  documents and verifies the auto-skip)
* AZ-614 — tlog time-base mismatch (currently blocks the heavy ACs
@@ -9,6 +9,16 @@
> is now stale and will be reconciled in autodev's existing-code Step 13
> (Update Docs); the operative CI contract is here.

> **Test-execution policy — 2026-05-20**: all tests run on the Jetson
> (colocated arm64 Woodpecker agent) only. The historical "Tier-1
> workstation Docker" path is deprecated. The `companion-tier1` and
> `operator-orchestrator` images below are still built and pushed for
> registry distribution (operator workstations consume the operator
> image; the cycle-2 `companion-jetson` image is the planned successor
> to `companion-tier1`), but no x86 agent participates in the **test**
> lane — `01-test.yml` is Jetson-only. Source of truth for the policy:
> `_docs/02_document/tests/environment.md`.

## Decision Record (cycle-1 scope)

| Decision | Choice | Rationale |
@@ -6,6 +6,12 @@ Ring buffer: trim to the last 15 entries. Categories: `estimation · architectur

---

## 2026-05-20 — [testing] Two-tier test policy retired — all tests run on Jetson only

**Trigger**: a `/test-run` invocation on the workstation Tier-1 Docker stack uncovered eight categorically distinct, sequential bugs in the supposedly-supported workstation path (Dockerfile `COPY` ordering before editable install, base-image pip too old for `gtsam` pre-release wheels, runtime stage missing the `python3` metapackage that `python3 -m venv` symlinks against, missing `libgl1` / `libglib2.0-0` for `cv2` import, missing `runtime_root/__main__.py` shim, lazy import that never registered the `c6_tile_cache` config block, and a `BUILD_FAISS_INDEX` env flag gap in `docker-compose.test.jetson.yml`). None of these had been hit before because no one had actually executed the workstation Docker stack end-to-end since it was authored — the colocated Jetson Woodpecker agent was the only test environment that ever ran. Maintaining the divergent x86 path was producing only false-negative signal and consuming engineering time, never honest test coverage.

**What changed**: the two-tier execution profile is retired in favour of a Jetson-only policy. Source of truth: `_docs/02_document/tests/environment.md` (active-policy banner at top + superseding "Decision (2026-05-20)" in § Test Execution). CI policy updated in `_docs/04_deploy/ci_cd_pipeline.md` and `_docs/02_document/deployment/ci_cd_pipeline.md`. Local-development entry point: `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH alias. The general rule: **if you have one environment that matches production and one that doesn't, don't maintain both — maintain the one that matches.**

## 2026-05-20 — [process] Before classifying a per-task FAIL, probe cross-cutting state the task depends on (registries, factories, baselines)

**Trigger**: cycle-1 Step 7 Product Implementation Completeness Gate originally classified AZ-332 + AZ-333 as FAIL and proposed two per-strategy remediation tasks (AZ-589 + AZ-590). Post-mortem found the actual gap was the empty central `_STRATEGY_REGISTRY` — a cross-cutting concern that should have produced **one** task (AZ-591), not two. AZ-589 + AZ-590 closed Won't Fix.