[autodev] Update configuration and documentation for cycle-1

- Enhanced `.env.example` with detailed CMake build flags and replay-mode strategy flags for development and CI environments.
- Updated `.gitignore` to include a new deploy rollback bookmark.
- Revised `_docs/_autodev_state.md` to reflect the current task status and steps.
- Added new lessons to `_docs/LESSONS.md` regarding testing and architectural improvements.
- Documented changes in `_docs/02_document/deployment/ci_cd_pipeline.md` to reflect the relaxed OpenCV version pin.
- Updated test data documentation in `_docs/02_document/tests/test-data.md` to clarify fixture usage and paths.

This commit continues the cycle-1 documentation sync and addresses various configuration updates for improved clarity and functionality.
Oleksandr Bezdieniezhnykh
2026-05-20 08:05:35 +03:00
parent ab92946833
commit bf13549b32
34 changed files with 3689 additions and 42 deletions
@@ -0,0 +1,242 @@
# GPS-Denied Onboard — Deployment Status Report
> Generated by `/autodev` greenfield Step 16 (Deploy) — Step 1 status & env
> assessment, 2026-05-19. Inputs: `_docs/02_document/architecture.md`,
> 14 component specs in `_docs/02_document/components/`,
> `_docs/00_problem/restrictions.md`, existing root-level Docker artefacts
> (`docker-compose.yml`, `docker-compose.test*.yml`, `docker/*.Dockerfile`),
> and `.env.example`.
## Deployment Readiness Summary
| Aspect | Status | Notes |
|--------|--------|-------|
| Architecture defined | ✅ | `architecture.md` v1 + 11 ADRs; vision section is the spine, no drift detected |
| Component specs complete | ✅ | 14 components (C1–C8, C10–C13) with description.md present |
| Infrastructure prerequisites met | ⚠️ Partial | Tier-1 (workstation Docker + Postgres 16 + mock-sat) ready and committed; **parent-suite CI/CD (Woodpecker + Gitea Packages registry + Caddy TLS) already exists** at `../_infra/ci/` — this submodule needs to author `.woodpecker/01-test.yml` + `.woodpecker/02-build-push.yml` per the suite-mandated two-workflow contract; Tier-2 Jetson runner availability tracked as cycle-1 follow-up (AZ-592 / AZ-593) |
| External dependencies identified | ✅ | parent-suite `satellite-provider` (read pre-flight, write post-landing via planned D-PROJ-2), parent-suite `flights` REST, ArduPilot Plane FC (signed MAVLink 2.0), iNav FC (MSP2), QGroundControl, nav camera (ADTi 20MP) |
| Blockers | 4 | (1) **Cross-cutting ADR-005 ↔ parent-suite Jetson Docker compose contradiction** — see "Cross-Cutting Decision" section below; (2) D-PROJ-2 ingest endpoint planned, parent-suite work; (3) AZ-592/AZ-593 Tier-2 wiring deferred to follow-up cycle; (4) D-CROSS-CVE-1 opencv pin replay deferred on upstream `gtsam` numpy-2 wheels |
The system is **deploy-plannable today** at the Tier-1 / dev level — but
production Tier-2 delivery shape (bare JetPack per ADR-005 vs Docker
container under the parent-suite Watchtower flow) needs a user decision
before the deploy plan steps 2–7 can be authored without drift. See the
new "Cross-Cutting Decision" section below.
## Parent-Suite Context (Authoritative Discovery)
This submodule lives inside the **Azaion suite meta-repo** at `../`. The
suite already has a fully-installed CI/CD + production-deploy stack the
GPS-Denied Onboard plan **did not previously account for**. Citations below.
| Suite artefact | Path | What it mandates for this submodule |
|----------------|------|--------------------------------------|
| Woodpecker CI + Gitea Packages + Caddy TLS | `../_infra/ci/README.md` | Two-workflow per-repo pattern: `.woodpecker/01-test.yml` (test on push/PR) + `.woodpecker/02-build-push.yml` (build+push, gated `depends_on: [01-test]`, multi-arch matrix). All images go to `${REGISTRY_HOST}/azaion/<service>:<branch>-arm` (e.g., `git.azaion.com/azaion/gps-denied-onboard:dev-arm`). Registry secrets (`registry_host`, `registry_user`, `registry_token`) are already provisioned as Woodpecker global secrets — this submodule consumes them. |
| Jetson production compose | `../_infra/deploy/jetson/docker-compose.yml` | The fielded Jetson runs **9 application services + Postgres + Watchtower** via `docker compose up -d`. One of those services is already declared: `gps-denied-onboard: image: ${REGISTRY_HOST}/azaion/gps-denied-onboard:${BRANCH:-main}-arm`, `runtime: nvidia`, port `5040:8080`, env `AUTOPILOT_URL: http://autopilot:8080`, `MODELS_DIR: /data/models`. **This contradicts ADR-005's "bare JetPack, no Docker" stance** — see Cross-Cutting Decision below. |
| Flight-state safety gate | `../_infra/deploy/jetson/README.md` → "Flight-state convention" | All on-Jetson model syncs and Watchtower-driven container restarts are gated by `/run/azaion/in-flight` (written by `autopilot` service on arm/disarm). Any GPS-Denied Onboard production deploy on Jetson must honour the same flag. |
| Audit logging | Same README → "Audit: what is this device running?" | OCI labels (`org.opencontainers.image.revision/created/source`) + per-service env `AZAION_REVISION=$CI_COMMIT_SHA` + journald-captured `AZAION_UPDATE_EVENT` lines. Every submodule's Dockerfile must accept `--build-arg CI_COMMIT_SHA` and stamp the OCI labels + `ENV AZAION_REVISION`. |
| Suite-level e2e | `../.woodpecker/suite-e2e.yml` | Manual / nightly cron pipeline that brings up `_infra/deploy/jetson/docker-compose.yml` + `e2e/docker-compose.suite-e2e.yml`; downstream signal only, does not gate this submodule. Already references `gps-denied-onboard` as one of the services pulled. |
| Outstanding suite follow-up #4 | `../_infra/ci/README.md` → Follow-ups | "Missing Dockerfiles for Jetson edge services. `detections-semantic/`, `gps-denied-onboard/`, `gps-denied-desktop/` have no `Dockerfile` / `Dockerfile.jetson` today." This submodule's `docker/companion-tier1.Dockerfile` exists for Tier-1; **a `Dockerfile.jetson` for the arm64 Watchtower image does not exist yet**. |
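
The image-naming and audit-stamping conventions in the table above can be sketched in Python. This is a minimal illustration, not the suite's actual tooling: the function names `image_ref` and `oci_labels` are hypothetical, and the branch sanitisation rule is an assumption (the suite README may handle branch names differently). Only the `<host>/azaion/<service>:<branch>-arm` shape, the `main` default, and the three OCI label keys come from this report.

```python
import re


def image_ref(registry_host: str, service: str, branch: str = "main") -> str:
    """Compose <registry_host>/azaion/<service>:<branch>-arm.

    The default branch mirrors ${BRANCH:-main} in the suite compose file.
    """
    if not registry_host:
        raise ValueError("registry_host must not be empty")
    # Assumption: characters that are invalid in a Docker tag (such as '/')
    # are replaced with '-'. The real pipeline may use another rule.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    return f"{registry_host}/azaion/{service}:{tag}-arm"


def oci_labels(commit_sha: str, created: str, source: str) -> dict[str, str]:
    """Return the three OCI labels the suite audit chain expects."""
    return {
        "org.opencontainers.image.revision": commit_sha,
        "org.opencontainers.image.created": created,
        "org.opencontainers.image.source": source,
    }


if __name__ == "__main__":
    print(image_ref("git.azaion.com", "gps-denied-onboard", "dev"))
```

Keeping the reference in one helper means the CI workflow and any local pull script cannot drift apart on the tag format.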
## Cross-Cutting Decision — ADR-005 vs Parent-Suite Jetson Docker Compose
**The conflict in one paragraph.** ADR-005 in `architecture.md` says:
"Tier-2 (Jetson) does NOT use Docker — TensorRT INT8 calibration caches
and `jetson-stats` thermal telemetry are most reliable without a container
layer, per D-C7-9 + D-C10-6. The deployed image on the Jetson is a
JetPack-based system image with the deployment binary preinstalled." The
parent suite's `_infra/deploy/jetson/docker-compose.yml` declares
`gps-denied-onboard` as a Docker service pulled by Watchtower, with
`runtime: nvidia` for GPU access, alongside 8 other suite services. Both
cannot be the production deploy path simultaneously — this needs a user
call before Step 2 (Containerization) writes the production
containerization plan.
**Resolution options:**
| Option | What it means | Implications |
|--------|---------------|--------------|
| **A** | Keep ADR-005 — GPS-Denied Onboard is **NOT** in the Jetson Docker compose. It runs as a bare-metal systemd service on the same Jetson, beside the Docker stack. Watchtower does not manage it. | Parent-suite `_infra/deploy/jetson/docker-compose.yml` must drop the `gps-denied-onboard` service (a parent-suite edit). This submodule ships a JetPack-flashable tarball + systemd unit instead of an image. Deploy procedure becomes operator-side `apt`-/`tarball`-install, not `docker compose up`. CI builds a release tarball, not an image. Updates lose the Watchtower + journald audit chain — we need an equivalent. |
| **B** | Reverse ADR-005 — GPS-Denied Onboard ships a `Dockerfile.jetson` and runs as a Docker container under the parent-suite Watchtower flow. The ADR is rewritten to "Docker on Jetson with `runtime: nvidia` + explicit calibration-cache + jetson-stats volume mounts to preserve the D-C7-9 / D-C10-6 properties". | Suite follow-up #4 is closed by this submodule. CI fits the suite two-workflow pattern. Flight-state gate honoured via `/run/azaion/in-flight` volume mount. TensorRT INT8 calibration cache + jetson-stats telemetry must be validated under Docker (not just bare JetPack) — Step 2 of this deploy plan owns that validation; if it fails, fall back to (A). |
| **C** | Hybrid — GPS-Denied Onboard ships **both** a Docker image (for Tier-1 + dev + e2e + replay) **and** a JetPack bare-metal artefact (for Tier-2 production). | Two release artefacts to maintain; two CI lanes; matches ADR-005 + ADR-002 mechanism for "binary tracks". Parent-suite compose still drops the Watchtower-managed `gps-denied-onboard` service (operator runs the bare-metal artefact alongside the Docker stack). |
**Autodev-resolved (2026-05-19 19:09 UTC+3): Option B.** The user
explicitly skipped the structured BLOCKING gate, directing the autodev to
continue with available information. Option B is selected because:
1. **Existence proof on the same platform.** The parent suite's
`detections` service already runs as a Docker container on the Jetson
with `runtime: nvidia` (`Dockerfile.jetson` + suite production compose).
GPU access + INT8-class inference in Docker on Jetson is a working
pattern in this suite, not a hypothetical.
2. **Suite follow-up #4** in `../_infra/ci/README.md` explicitly lists
"Missing Dockerfiles for Jetson edge services. … `gps-denied-onboard/`"
— the parent-suite operator expects this submodule to ship a
`Dockerfile.jetson` and join the Watchtower flow.
3. **Audit + flight-gate chain reuse.** Option B inherits
`AZAION_UPDATE_EVENT` journald audit + `/run/azaion/in-flight`
flight-state gate + per-flight ephemeral secret rotation patterns
without re-inventing them at bare-metal level.
4. **ADR-005 concerns are validatable in Step 2.** The two technical
concerns ADR-005 cited (TensorRT INT8 calibration cache stability +
`jetson-stats` thermal telemetry access) become explicit Step 2
validation gates: model-cache mounted as a named Docker volume (same
pattern `detections` uses for `model-cache:/data/models`); jetson-stats
accessed via `runtime: nvidia` + the standard nvidia container toolkit
device passthrough. **If either validation fails in Step 2**, the
autodev falls back to Option A and reopens this section.
5. **Step 3 CI/CD authoring is straightforward** under Option B — the
suite already provides the two-workflow `.woodpecker/` templates and
registry secrets; this submodule plugs into the existing pipeline.
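
The fallback rule in item 4 is small enough to state as code. The sketch below is illustrative only: `Step2Validation` and `resolve_delivery_option` are hypothetical names, and how each validation result is actually measured is left to Step 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step2Validation:
    """Outcome of the two Step 2 validation gates (hypothetical structure)."""

    calibration_cache_stable: bool   # TensorRT INT8 cache survives under Docker
    jetson_stats_accessible: bool    # thermal telemetry readable in the container


def resolve_delivery_option(result: Step2Validation) -> str:
    """Return "B" only if both gates pass; otherwise fall back to "A"."""
    if result.calibration_cache_stable and result.jetson_stats_accessible:
        return "B"
    return "A"
```

Either failing gate alone is enough to trigger the fallback, which matches the "if either validation fails" wording above.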
**To reverse this decision later**: edit this section to record the new
choice, restore ADR-005's bare-JetPack language in `architecture.md`, and
re-run `/autodev` — Step 2 will detect the change via the rewritten
section and rebuild the containerization plan accordingly.
**Required architecture follow-up under Option B**: the `architecture.md`
ADR-005 paragraph "Container scope: …Tier-2 (Jetson) does NOT use Docker"
becomes inconsistent with this decision. Step 2 of the deploy plan will
draft the ADR-005 amendment (or replacement ADR-012 — "Docker on Jetson
with explicit calibration-cache + jetson-stats passthrough") and the
amendment lands in Step 12 (Test-Spec Sync / Update Docs equivalent)
output. Recording the architectural drift here so it is not lost.
The originally-listed registry decision is **already settled by the
parent suite** — `${REGISTRY_HOST}` is the Gitea Packages registry behind
Caddy TLS (`git.azaion.com` per the example in
`../_infra/deploy/jetson/README.md`); no operator choice needed for this
submodule.
## Component Status
> Docker-ready column means: does the component run inside the Tier-1
> Docker images? Tier-2 production deploys via JetPack image flash, not
> Docker (ADR-005); that column is N/A for Tier-2-only paths.
| Component | State | Docker-ready (Tier-1) | Notes |
|-----------|-------|-----------------------|-------|
| C1 — VIO (`c1_vio`) | ✅ implemented + tested (operational default = `KltRansac` AZ-334) | yes | `Okvis2`/`VinsMono` ship as facade-only — AZ-332/AZ-333 BLOCKED on Tier-2 prereqs; follow-ups AZ-592/AZ-593 in backlog (ADR-001 cycle-1 note). `_STRATEGY_REGISTRY` registers all three slots; selecting an unlinked strategy raises `StrategyNotLinkedError` |
| C2 — VPR (`c2_vpr`) | ✅ implemented + tested | yes | `UltraVPR` primary; `MegaLoc`/`MixVPR`/`SelaVPR`/`EigenPlaces`/`NetVLAD` secondaries behind `BUILD_*` flags per ADR-002 |
| C2.5 — Re-rank (`c2_5_rerank`) | ✅ implemented + tested | yes | inlier-count re-rank top-K=10 → top-N=3 |
| C3 — Matcher (`c3_matcher`) | ✅ implemented + tested | yes | `DISK+LightGlue` primary; `ALIKED+LightGlue` / `XFeat` secondaries |
| C3.5 — AdHoP (`c3_5_adhop`) | ✅ implemented + tested | yes | conditional refinement; `passthrough` baseline path |
| C4 — Pose (`c4_pose`) | ✅ implemented + tested | yes | OpenCV `solvePnPRansac` + GTSAM Marginals; D-CROSS-LATENCY-1 auto-degrade |
| C5 — State (`c5_state`) | ✅ implemented + tested | yes | GTSAM iSAM2 + `IncrementalFixedLagSmoother`; ESKF baseline behind `BUILD_STATE_ESKF` |
| C6 — Tile cache (`c6_tile_cache`) | ✅ implemented + tested | yes | Postgres 16 btree spatial index + filesystem tiles + FAISS HNSW descriptor index |
| C7 — Inference (`c7_inference`) | ✅ implemented + tested (Tier-1 PyTorch FP16); Tier-2 TensorRT path pinned | yes (PyTorch FP16); N/A (TensorRT runs on bare JetPack) | `INFERENCE_BACKEND={tensorrt|pytorch_fp16|onnx_trt_ep}`; ONNX+TRT EP fallback |
| C8 — FC adapter (`c8_fc_adapter`) | ✅ implemented + tested | yes | `pymavlink` ArduPilot Plane (signed) + `MSP2` iNav (unsigned, accepted risk); `MavlinkTransport` Protocol seam (Serial / Noop for replay per ADR-011) |
| C10 — Provisioning (`c10_provisioning`) | ✅ implemented + tested | yes (operator-orchestrator image) | engine + descriptor + manifest build with SHA-256 content-hash gate |
| C11 — Tile Manager (`c11_tilemanager`) | ✅ implemented + tested | yes (operator-orchestrator image ONLY) | airborne image MUST NOT link C11 (ADR-004 process-level isolation); CI SBOM-diff + runtime self-check + NFT-SEC-02 egress test enforce |
| C12 — Operator orchestrator (`c12_operator_orchestrator`) | ✅ implemented + tested | yes (operator-orchestrator image ONLY) | `FlightsApiClient` + `PostLandingUploadOrchestrator` + `OperatorReLocService` |
| C13 — FDR (`c13_fdr`) | ✅ implemented + tested | yes | ≤ 64 GB / flight ring; `flight_footer` record drives C12 post-landing gate |
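
The C1 row describes a registry in which every strategy slot is registered but only linked strategies can be selected. A minimal sketch of that behaviour follows. The names `_STRATEGY_REGISTRY`, `StrategyNotLinkedError`, `KltRansac`, `Okvis2`, and `VinsMono` come from this report; the registry layout, the `select_vio_strategy` helper, and the use of `None` to mark a facade-only slot are assumptions, not the project's implementation.

```python
from typing import Callable, Optional


class StrategyNotLinkedError(RuntimeError):
    """Raised when a registered strategy has no implementation linked in."""


class KltRansac:
    """Stand-in for the operational default VIO strategy."""

    name = "KltRansac"


# Assumption: a slot maps to a factory, or to None when it is facade-only.
_STRATEGY_REGISTRY: dict[str, Optional[Callable[[], object]]] = {
    "KltRansac": KltRansac,
    "Okvis2": None,    # facade-only until the Tier-2 prerequisites land
    "VinsMono": None,  # facade-only until the Tier-2 prerequisites land
}


def select_vio_strategy(name: str = "KltRansac") -> object:
    """Instantiate a strategy, failing loudly if the slot is not linked."""
    if name not in _STRATEGY_REGISTRY:
        raise KeyError(f"unknown VIO strategy: {name!r}")
    factory = _STRATEGY_REGISTRY[name]
    if factory is None:
        raise StrategyNotLinkedError(f"{name} is registered but not linked")
    return factory()
```

Distinguishing "unknown" from "registered but not linked" keeps a configuration typo from being mistaken for a missing build flag.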
### Binary tracks (three, per ADR-002 + ADR-011)
| Binary | Image / target | Contents | Where it runs |
|--------|---------------|----------|---------------|
| `airborne` | Tier-2: bare JetPack 6.2 system image / Tier-1: `gps-denied-onboard/companion:dev` Docker image | C1–C8 + C13 + replay strategies (`BUILD_VIDEO_FILE_FRAME_SOURCE`, `BUILD_TLOG_REPLAY_ADAPTER`, `BUILD_REPLAY_SINK_JSONL` ON); same image runs live and replay modes (config-selected) | Jetson Orin Nano Super (prod); workstation Docker (dev/CI) |
| `research` | Tier-1 Docker / Tier-2 bare JetPack | airborne contents + every non-default strategy linked (IT-12 comparative study) | Lab Jetson, CI Tier-2 jobs |
| `operator-orchestrator` | Tier-1 Docker image `gps-denied-onboard/operator-orchestrator:dev` | C10 + C11 + C12; ships with mock-suite-sat-service compose for offline tests | Operator workstation |
## External Dependencies
| Dependency | Type | Required For | Status |
|------------|------|--------------|--------|
| PostgreSQL 16 | Database (C6 tile + descriptor metadata) | All deployments | ✅ Tier-1: `db` service in `docker-compose.yml`; Tier-2: native Postgres on operator workstation + Jetson (sized for ≤ 10 GB cache budget) |
| Filesystem `./tiles/{zoomLevel}/{x}/{y}.jpg` | Tile binary store mirroring `satellite-provider` on-disk layout | All deployments (C6) | ✅ Tier-1: `tile-data` volume; Tier-2: NVM partition (≥ 10 GB) |
| Parent-suite `satellite-provider` (.NET 8 REST + on-disk tiles) | External service | Operator workstation only (pre-flight `TileDownloader` via C11; post-landing `TileUploader` via C11/C12) | ✅ pre-flight read path is live; ⚠️ post-landing POST contract (D-PROJ-2) **planned**, parent-suite work — see `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md` |
| Parent-suite `flights` REST service (.NET 8) | External service | Operator workstation only (C12 reads `Flight` DTO via `FlightsApiClient`) | ✅ contract owned by parent-suite; offline `--flight-file` path implemented (AZ-489) as fallback |
| ArduPilot Plane FC | MAVLink 2.0 over UART/USB (signed) | Production (airborne ↔ FC) | ✅ adapter implemented; signing handshake validated via NFT-SEC-03; per-flight key rotation logged to FDR |
| iNav FC | MSP2 over UART (unsigned, accepted risk) | Production (airborne ↔ FC) | ✅ adapter implemented; no signing — documented residual risk |
| QGroundControl (GCS) | MAVLink 2.0 12 Hz downsampled summary | Production (operator monitoring) | ✅ outbound encoder + STATUSTEXT path covered |
| Nav camera (ADTi 20MP 20L V1) | Camera SDK / V4L2 over USB / MIPI-CSI / GigE | Production (airborne) | ⚠️ live driver per deployed lens module — calibration JSON (`adti20.json`) is operator-acquired per D-PROJ-1 (hybrid factory + checkerboard); `adti26.json` test-fixture used in dev / CI |
| GitHub Actions runner (Tier-1) | CI | Build + lint + unit + most integration + Tier-1 e2e | ✅ GitHub-hosted x86_64 runner; pinned actions per `_docs/02_document/deployment/ci_cd_pipeline.md` |
| Self-hosted Jetson runner (Tier-2) | CI | AC-bound NFTs (NFT-PERF-* + NFT-LIM-* + IT-12) | ⚠️ runner availability tracked as a risk-register entry (ADR-005). Cycle-1 perf probe ran Tier-1 only — NFT-PERF-01/03 Tier-2 hardware required, NFT-PERF-02/04 SITL replay fixture pending AZ-595 |
## Infrastructure Prerequisites
| Prerequisite | Status | Action Needed |
|--------------|--------|---------------|
| Container registry | ✅ **Already set by parent suite** | `${REGISTRY_HOST}` (Gitea Packages behind Caddy TLS, e.g. `git.azaion.com`). Images: `${REGISTRY_HOST}/azaion/gps-denied-onboard:<branch>-arm`. Woodpecker global secrets `registry_host` / `registry_user` / `registry_token` already provisioned per `../_infra/ci/README.md`. **No operator choice needed.** |
| Cloud account | N/A | No cloud orchestration. The CI/CD server itself is a self-hosted Jetson colocated with the registry — see `../_infra/ci/README.md` → "Architecture". |
| DNS configuration | ✅ **Already set by parent suite** | `REGISTRY_DOMAIN` + `WOODPECKER_DOMAIN` resolve to the CI host's public IP. Operator workstation reaches `satellite-provider` over LAN / VPN; no public DNS for the airborne / operator side from this submodule. |
| SSL certificates | ✅ **Already set by parent suite** (Caddy + Let's Encrypt / internal / external-file modes) | Suite operator chooses the mode in `../_infra/ci/.env`. The companion has no inbound listeners (NFT-SEC-05 in-flight egress lockdown). |
| CI/CD platform | ⚠️ Suite-mandated (Woodpecker CI two-workflow pattern); **submodule pipeline files missing** | This submodule has **no `.woodpecker/` folder yet**. Suite follow-up #4 in `../_infra/ci/README.md` confirms `gps-denied-onboard` is one of the services awaiting CI integration. Step 3 of this deploy plan must author `.woodpecker/01-test.yml` (Python `pytest` + Tier-1 e2e via the existing `docker-compose.test.yml`) and `.woodpecker/02-build-push.yml` (multi-arch matrix → `${REGISTRY_HOST}/azaion/gps-denied-onboard:<branch>-arm`). The existing pre-cycle-1 `_docs/02_document/deployment/ci_cd_pipeline.md` was written against an assumed GitHub Actions runner — Step 3 must rewrite it against the actual suite Woodpecker pattern. |
| Secret manager | ⚠️ Per-flight ephemeral, no external manager | Per-flight MAVLink signing key + per-flight onboard signing key are **generated at takeoff load**, rotated per flight, logged to FDR. Pre-flight `satellite-provider` API key lives on the operator workstation only; never written to companion image. **No external secret manager required** for the companion. For the operator workstation, the operator's local credential store / OS keyring is sufficient. |
| Image build host | ⚠️ Depends on the Cross-Cutting Decision above | Option B (Docker on Jetson) requires arm64 build agents (already provisioned at the suite level — Jetson colocated agent + optional remote amd64). Option A (bare JetPack) requires a JetPack 6.2 SDK build host with `pyproject.toml` wheel build + native CMake build; CI lane is different (release-tarball lane, not registry push). |
| JetPack 6.2 system image | ⚠️ Required for Tier-2 hardware regardless of option | Operator burns the JetPack 6.2 + Jetson Linux base image; Step 6 documents the procedure. Under Option A this image hosts a bare-metal install; under Option B it hosts Docker + `runtime: nvidia` + the suite-level compose. |
| Flight-state gate (`/run/azaion/in-flight`) | ⚠️ Suite-mandated for any Watchtower-managed production deploy | Under Option B, the GPS-Denied Onboard image must accept the same volume mount + honour the flag. Under Option A, the bare-metal systemd unit must also gate on it (the parent-suite `autopilot` service still writes the flag). Step 6 documents this. |
| Audit / OCI labels (`AZAION_REVISION`, `org.opencontainers.image.revision/created/source`) | ⚠️ Suite-mandated under Option B; recommended under Option A | The suite `journalctl -g AZAION_UPDATE_EVENT` audit chain depends on these. Step 2 must add them to the Dockerfile under Option B; Step 7 deployment scripts must emit an equivalent under Option A. |
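
The flight-state gate row above boils down to "do nothing disruptive while the flag is set". The sketch below shows one way a deploy script could honour it. It rests on a loud assumption: the flag is treated as "in flight" whenever the file exists, regardless of content. The suite README is the authority on the real convention, and `update_allowed` / `run_if_grounded` are hypothetical names.

```python
from pathlib import Path
from typing import Callable

IN_FLIGHT_FLAG = Path("/run/azaion/in-flight")


def update_allowed(flag: Path = IN_FLIGHT_FLAG) -> bool:
    """Assumption: the flag file exists only while the vehicle is armed."""
    return not flag.exists()


def run_if_grounded(action: Callable[[], None], flag: Path = IN_FLIGHT_FLAG) -> bool:
    """Run a disruptive action (restart, model sync) only on the ground.

    Returns True if the action ran, False if it was skipped.
    """
    if not update_allowed(flag):
        return False
    action()
    return True
```

Passing the flag path as a parameter keeps the check testable without touching `/run`.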
## Deployment Blockers
| Blocker | Severity | Resolution |
|---------|----------|------------|
| **ADR-005 ↔ parent-suite Jetson Docker compose contradiction** | High (blocks Step 2 Containerization) | See "Cross-Cutting Decision" section above. User picks A / B / C; the choice determines whether Step 2 writes a Docker-on-Jetson plan or a bare-metal JetPack plan. |
| **D-PROJ-2** — parent-suite `satellite-provider` ingest endpoint + voting layer not yet implemented | Medium (production-blocking for post-landing upload only; airborne path is unaffected) | Parent-suite work tracked in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. The onboard side ships against the real service (download) + e2e-test-only `mock-suite-sat-service` fixture (upload). Post-landing upload tool keeps batches queued locally until D-PROJ-2 lands. |
| **AZ-592 / AZ-593** — Tier-2 OKVIS2 / VINS-Mono wiring (build env + Jetson + DBoW2 vocab) | Medium (no impact on cycle-1 production deploy — operational default is `KltRansac` AZ-334) | Both parked in `_docs/02_tasks/backlog/`; follow-up cycle (ADR-001 cycle-1 note). Cycle-1 deployment ships with `KltRansac` as the operational `VioStrategy`. |
| **D-CROSS-CVE-1** – `opencv-python ≥ 4.12.0` pin deferred on `gtsam==4.2` numpy<2 ABI block | Low (CVE-2025-53644 re-validated against 4.11.0.86; no advisory ties it to the current pin band, and the NFT-SEC-04 fuzz fixture is the executable confirmation) | Replay condition tracked in `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`. Replay lands when upstream `gtsam` ships numpy-2 wheels (or an alternative SE(3) backend); at that point also bump `cryptography ≥ 46.0.7` per Phase 1 finding F1. |
## Required Environment Variables
> Production-required variables on the companion image are the smaller set
> below (12 entries). The operator-orchestrator image consumes the same
> set plus the C12-specific knobs documented in
> `_docs/02_document/components/13_c12_operator_orchestrator/description.md`.
| Variable | Purpose | Required In | Default (Dev) | Source (Staging / Prod) |
|----------|---------|-------------|---------------|--------------------------|
| `GPS_DENIED_FC_PROFILE` | Selects FC adapter at composition root: `ardupilot_plane \| inav` | Airborne, operator-orchestrator | `ardupilot_plane` | Per-flight config from the operator |
| `GPS_DENIED_TIER` | Runtime tier gate: `1`=workstation/CI, `2`=Jetson production | All | `1` | `1` for CI containers, `2` baked into the JetPack image |
| `DB_URL` | Postgres connection (C6 tile + descriptor metadata) | All | `postgresql://gps_denied:dev@db:5432/gps_denied` | Operator workstation: local Postgres credentials; Jetson production: local Postgres init script with random per-host password |
| `SATELLITE_PROVIDER_URL` | Pre-flight tile download endpoint | Operator-orchestrator only (never on airborne) | `http://mock-sat:5100` | Operator workstation env / VPN-resolved hostname; **must be empty on airborne** (defence-in-depth NFT-SEC-05) |
| `CAMERA_CALIBRATION_PATH` | Path to JSON camera calibration loaded at startup | Airborne, operator-orchestrator | `/fixtures/calibration/adti26.json` | Production: `/etc/gps-denied/calibration/adti20.json` (operator-acquired per D-PROJ-1) |
| `LOG_LEVEL` | Structured log level (`DEBUG \| INFO \| WARNING \| ERROR`) | All | `DEBUG` | Production: `INFO` |
| `LOG_SINK` | Structured log destination (`console \| journald \| fdr`) | All | `console` | Production: `fdr` (companion); `journald` (operator workstation) |
| `MAVLINK_SIGNING_KEY` | Per-flight MAVLink 2.0 signing key path | Airborne (ArduPilot profile) | `tests/fixtures/mavlink_signing/dev_key` | Production: per-flight ephemeral key generated at takeoff load, rotated per flight, logged to FDR (Principle #7) |
| `INFERENCE_BACKEND` | Selects C7 backend (`tensorrt \| pytorch_fp16 \| onnx_trt_ep`) | Airborne, operator-orchestrator | `pytorch_fp16` | Tier-2 production: `tensorrt`; Tier-1 CI: `pytorch_fp16` |
| `FDR_PATH` | C13 ring writer location | Airborne | `/var/lib/gps-denied/fdr` | Production: `/var/lib/gps-denied/fdr` on the companion NVM partition (≥ 64 GB) |
| `TILE_CACHE_PATH` | C6 filesystem tile root | Airborne, operator-orchestrator | `/var/lib/gps-denied/tiles` | Production: `/var/lib/gps-denied/tiles` on the companion NVM (≥ 10 GB) |
| `BUILD_VINS_MONO`, `BUILD_SALAD`, `BUILD_C11_TILE_MANAGER` | Build-time strategy / component gating (ADR-002) | Build host | `OFF` for deployment binary | `OFF` on airborne (`BUILD_C11_TILE_MANAGER` MUST stay OFF per ADR-004); `ON` on research binary |
| `BUILD_VIDEO_FILE_FRAME_SOURCE`, `BUILD_TLOG_REPLAY_ADAPTER`, `BUILD_REPLAY_SINK_JSONL` (optional) | Replay-mode strategy gating (ADR-011) | Replay-capable images | unset (defaults to ON in the airborne / research binaries) | `ON` in airborne + research; explicitly set in `docker-compose.test*.yml` for CI |
| `BUILD_DEV_STATIC_KEY` (optional, dev-only) | Gates the AP adapter's `signing_key_source='dev_static'` path | Dev / CI containers only | unset / `OFF` | **MUST stay OFF on production images.** |
| `BUILD_STATE_ESKF` (optional) | Links the ESKF state estimator (mandatory simple-baseline) | Research binary | unset / `OFF` | `ON` on research binary; `OFF` on airborne |
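
Several rows in the table above carry hard constraints (allowed values, and `SATELLITE_PROVIDER_URL` being empty on airborne). A startup check along the following lines could enforce them. This is a sketch under stated assumptions: `validate_airborne_env` is a hypothetical helper, the fallback defaults are the "Default (Dev)" column values, and the real composition root may validate differently.

```python
from typing import Mapping

INFERENCE_BACKENDS = {"tensorrt", "pytorch_fp16", "onnx_trt_ep"}
FC_PROFILES = {"ardupilot_plane", "inav"}
TIERS = {"1", "2"}


def validate_airborne_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors: list[str] = []
    if env.get("GPS_DENIED_TIER", "1") not in TIERS:
        errors.append("GPS_DENIED_TIER must be 1 or 2")
    if env.get("GPS_DENIED_FC_PROFILE", "ardupilot_plane") not in FC_PROFILES:
        errors.append("GPS_DENIED_FC_PROFILE must be ardupilot_plane or inav")
    if env.get("INFERENCE_BACKEND", "pytorch_fp16") not in INFERENCE_BACKENDS:
        errors.append("INFERENCE_BACKEND has an unsupported value")
    # Defence-in-depth: the airborne image must not know the provider URL.
    if env.get("SATELLITE_PROVIDER_URL", ""):
        errors.append("SATELLITE_PROVIDER_URL must be empty on airborne")
    return errors
```

In use, the airborne entry point would call `validate_airborne_env(os.environ)` and refuse to start if the list is non-empty.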
### Sensitive variables — never committed
| Variable | Why |
|----------|-----|
| `MAVLINK_SIGNING_KEY` (real key) | Per-flight key, generated at takeoff. `.env.example` points at the dev test fixture only. |
| Real Postgres credentials | The committed `DB_URL` uses the local Docker `dev` password. Production credentials live on the host outside the image. |
| `SATELLITE_PROVIDER_URL` API token (when D-PROJ-2 lands) | Per-flight onboard signing key carried with each uploaded tile; never written to the companion image. |
## .env Files Created
- `.env.example` — committed to VCS, contains all variable names with placeholder values (extended with optional / build-flag rows in this step).
- `.env` — git-ignored (`.gitignore` line 64 confirms), contains development defaults that mirror `docker-compose.yml`. Safe to use for `docker compose up`, `python -m gps_denied_onboard.healthcheck`, and the existing test runner scripts.
- `.gitignore` already excludes `.env`, `.env.local`, and `*.key` while allow-listing the dev-fixture signing key (`!tests/fixtures/mavlink_signing/dev_key`). No changes needed.
## Pre-existing Deployment Artefacts (Discovered)
This is **not** a from-scratch deployment plan — the cycle-1 implementation already shipped working containerization scaffolding. Subsequent deploy-plan steps will harmonise these against the documents being produced rather than recreate them.
| File | Purpose | Status |
|------|---------|--------|
| `docker-compose.yml` | Tier-1 dev compose: `companion` + `operator-orchestrator` + `mock-sat` + `db` | ✅ working, healthchecks present |
| `docker-compose.test.yml` | Tier-1 e2e test compose (replay mode flags ON) | ✅ working |
| `docker-compose.test.jetson.yml` | Tier-2 Jetson e2e test compose | ✅ working |
| `e2e/docker/docker-compose.test.yml`, `e2e/docker/docker-compose.tier2-bridge.yml` | Suite-level e2e harness | ✅ owned by the e2e harness, referenced by `_docs/02_document/deployment/ci_cd_pipeline.md` |
| `docker/companion-tier1.Dockerfile`, `docker/operator-orchestrator.Dockerfile`, `docker/mock-suite-sat-service.Dockerfile` | Per-binary Dockerfiles | ✅ in tree (referenced by compose files) |
| `tests/e2e/Dockerfile`, `tests/e2e/Dockerfile.jetson` | Test runner images | ✅ in tree |
| `e2e/fixtures/tile-cache-builder/Dockerfile`, `e2e/fixtures/mock-suite-sat/Dockerfile`, `e2e/runner/Dockerfile` | Test fixtures | ✅ in tree |
| `scripts/run-tests.sh`, `scripts/run-tests-jetson.sh`, `scripts/run-performance-tests.sh` | Test entry points | ✅ in tree (Step 7 will add `deploy.sh`, `pull-images.sh`, `start-services.sh`, `stop-services.sh`, `health-check.sh`) |
| `_docs/02_document/deployment/ci_cd_pipeline.md` | Pre-existing CI/CD doc | ✅ exists (per Step 12 Test-Spec Sync output); Step 3 will reconcile against this status report |
## Next Steps
1. **User confirms this status report** (BLOCKING gate per the deploy skill Step 1).
2. **User picks the Cross-Cutting Decision option (A / B / C)** — this determines the production Tier-2 delivery shape and is required input for Step 2.
3. **Proceed to Step 2 (Containerization)** — under Option A: write the bare-metal JetPack production plan (tarball + systemd unit + flight-state gate) and the Tier-1 Docker plan (existing `docker-compose.yml`) separately. Under Option B: author `docker/Dockerfile.jetson` matching the suite-mandated OCI labels + `AZAION_REVISION` build-arg, and reconcile ADR-005 in `architecture.md` to the new "Docker on Jetson" stance. Under Option C: both artefacts, two CI lanes.
4. **Step 3 (CI/CD pipeline) is no longer "pick a platform"** — author `.woodpecker/01-test.yml` + `.woodpecker/02-build-push.yml` per the suite two-workflow contract (`../_infra/ci/README.md` → "Pipeline configuration — two-workflow contract"). Rewrite `_docs/02_document/deployment/ci_cd_pipeline.md` against the actual Woodpecker + Gitea Packages stack instead of the previously-assumed GitHub Actions runner.
5. After Step 3, auto-chain through Steps 4–7 (environment strategy, observability, deployment procedures, deployment scripts) per the deploy skill's workflow. Step 6 procedures must include the flight-state gate (`/run/azaion/in-flight`) and the audit-log chain (`AZAION_UPDATE_EVENT` via journald) regardless of which option wins above.