# GPS-Denied Onboard: Environment Strategy

> Date: 2026-05-09 (Plan Phase 2c, initial draft).
> Inputs: `_docs/02_document/architecture.md` § 3 (Deployment Model) + § 7 (Security Architecture); `_docs/02_document/data_model.md` § 5 (Seed Data); `_docs/00_problem/restrictions.md`; ADR-002, ADR-004, ADR-005.

## Environments

This project has **six environments**, not the canonical three (dev / staging / prod). The asymmetry reflects ADR-005 (Tier-1 / Tier-2) and ADR-004 (process-level isolation between the airborne companion image and the operator-side upload tool).

| Environment | Purpose | Infrastructure | Data Source |
|-------------|---------|---------------|-------------|
| `dev-tier1` | Local developer iteration; lint + unit + most integration tests | Workstation (Linux x86_64; NVIDIA GPU optional); Docker compose | Test fixtures (`adti26.json` calibration; `tests/fixtures/flight_derkachi/`) + `mock-suite-sat-service` |
| `dev-tier2` | Hardware-bound developer checks | Jetson Orin Nano Super dev kit on developer's desk; bare JetPack | Test fixtures + locally-mirrored `satellite-provider` |
| `staging-tier1` | CI runs that don't require Jetson hardware | GitHub-hosted runner (x86_64); Docker | Sealed test fixtures committed to the repo |
| `staging-tier2` | CI runs that require Jetson (AC-bound NFT-PERF-*, NFT-LIM-*, NFT-RES-*, NFT-SEC-*, IT-12) | Self-hosted Jetson runner; bare JetPack 6.2 | Same sealed fixtures + cached TRT engines per manifest hash |
| `production` | Deployed onboard companion image on a UAV | Jetson Orin Nano Super (pinned); bare JetPack 6.2; **no inbound network listening; no outbound network egress in flight** (NFT-SEC-05) | Operator-staged pre-flight cache + per-flight in-flight orthorectified tiles |
| `production-operator-workstation` | Pre-flight tile download (C11 `TileDownloader`); pre-flight cache artifact build (C10 driven by C12); post-landing tile upload (C11 `TileUploader`); FDR retrieval | Operator's Linux workstation; Docker for `satellite-provider` mirror | Operator-managed `satellite-provider` instance + the companion's NVM contents post-landing |

Notes:

- **No "staging" deployment of the companion**. Staging is purely a CI mode; there is no live staging Jetson UAV. Production is one step from CI: release artifacts → operator workstation → flashed Jetson.
- **The airborne companion never sees `staging-*` environments at runtime**. Staging is exclusively a CI gating concept.
- **The operator workstation is its own environment** with its own secrets posture (operator login + workstation hardening); see § Secrets Management.

## Environment Variables

Variables are categorized by which environment(s) consume them. Production has the **shortest** required list because in-flight network egress is forbidden, so most of the typical "service URL" variables disappear.

### Required variables: companion runtime (all environments)

| Variable | Purpose | dev-tier1 default | dev-tier2 default | production source |
|---|---|---|---|---|
| `DB_URL` | Local PostgreSQL connection | `postgresql://gps_denied:dev@db:5432/gps_denied` | `postgresql://gps_denied:dev@localhost:5432/gps_denied` | `postgresql://gps_denied@/gps_denied?host=/var/run/postgresql` (UNIX socket on Jetson, no password) |
| `CAMERA_CALIBRATION_PATH` | Camera calibration JSON path (Principle #1, data_model.md § 2.6) | `/fixtures/calibration/adti26.json` | `/fixtures/calibration/adti26.json` | `/etc/gps-denied/calibration/adti20.json` (per-deployed-unit, post D-PROJ-1 hybrid) |
| `GPS_DENIED_FC_PROFILE` | `ardupilot_plane` or `inav` | `ardupilot_plane` | per developer's bench setup | per UAV airframe (set via JetPack image's `/etc/gps-denied/runtime.yaml`) |
| `GPS_DENIED_VIO_STRATEGY` | `okvis2`, `vins_mono`, `klt_ransac` (ADR-001 startup-locked) | `okvis2` | `okvis2` | `okvis2` (production-default; pending IT-12 verdict) |
| `GPS_DENIED_VPR_STRATEGY` | `ultra_vpr`, `mega_loc`, `mix_vpr`, ... | `ultra_vpr` | `ultra_vpr` | `ultra_vpr` (Documentary Lead PRIMARY) |
| `GPS_DENIED_BUILD_KIND` | `deployment` or `research` (ADR-002; matches the binary's CMake flag set; the runtime validator fails fast if config asks for a strategy not linked into the binary) | `deployment` | `deployment` | `deployment` (research binary is dev-tier2 / staging-tier2 only) |
| `GPS_DENIED_FDR_RETENTION_DAYS` | FDR ring retention (data_model.md § 2.8) | `7` | `30` | `30` (operator-configurable per UAV) |
| `LOG_LEVEL` | `DEBUG` / `INFO` / `WARN` / `ERROR` | `DEBUG` | `INFO` | `INFO` (DEBUG is forbidden on the airborne image: there is no operator-readable console, and DEBUG output on the FDR ring would inflate it beyond the 64 GB AC-NEW-3 envelope) |
| `MAVLINK_SIGNING_KEY_PATH` | Per-flight MAVLink-2.0 signing key file (regenerated at takeoff load; see § Secrets Management) | `/fixtures/keys/dev_mavlink_signing.key` | `/fixtures/keys/dev_mavlink_signing.key` | `/var/lib/gps-denied/per-flight/mavlink_signing.key` (generated at takeoff, deleted on flight ring rollover) |
| `ONBOARD_TILE_SIGNING_KEY_PATH` | Per-flight onboard tile-signing private key | `/fixtures/keys/dev_onboard_signing.key` | `/fixtures/keys/dev_onboard_signing.key` | `/var/lib/gps-denied/per-flight/onboard_tile_signing.key` (generated at takeoff, deleted on flight ring rollover) |

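As an illustration of the production `DB_URL` form in the table above, the following sketch checks that a connection URL carries no password and selects a UNIX socket directory through the libpq-style `host` query parameter. The helper name `is_passwordless_socket_url` is hypothetical and not part of the project; it relies only on the standard library's URL parsing.

```python
from urllib.parse import parse_qs, urlsplit


def is_passwordless_socket_url(db_url: str) -> bool:
    """Return True if db_url has no password and targets a UNIX socket directory.

    Hypothetical helper for illustration; the real runtime may check this differently.
    """
    parts = urlsplit(db_url)
    if parts.scheme != "postgresql":
        return False
    # libpq-style URLs select a socket directory via the `host` query parameter.
    hosts = parse_qs(parts.query).get("host", [])
    uses_socket = len(hosts) == 1 and hosts[0].startswith("/")
    # No TCP host in the authority section and no embedded password.
    return uses_socket and parts.hostname is None and parts.password is None


prod = "postgresql://gps_denied@/gps_denied?host=/var/run/postgresql"
dev = "postgresql://gps_denied:dev@db:5432/gps_denied"
print(is_passwordless_socket_url(prod), is_passwordless_socket_url(dev))
```

A check of this kind could run alongside the other startup validations, so that a development-style URL with an embedded password is caught before it reaches an airborne image.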
### Required variables: Tier-1 / staging only (NOT on production)

| Variable | Purpose | dev-tier1 default | staging-tier1 default | production |
|---|---|---|---|---|
| `SATELLITE_PROVIDER_URL` | Where to reach the tile source for pre-flight runs (CI / dev) | `http://mock-sat:5100` | `http://mock-sat:5100` | **NOT SET**; production never reaches a satellite-provider directly while airborne |
| `MOCK_FAILURE_PROFILE` | Failure injection for `mock-suite-sat-service` | `none` | per CI scenario | n/a |
| `GPS_DENIED_REPLAY_FIXTURE` | Path to replay corpus | `/fixtures/flight_derkachi` | `/fixtures/flight_derkachi` | n/a |

### Required variables: operator workstation

| Variable | Purpose | Source |
|---|---|---|
| `SATELLITE_PROVIDER_URL` | Operator's local mirror or VPN-reached lab service | Operator config (operator workstation `.env` file) |
| `SATELLITE_PROVIDER_API_KEY` | TLS + service-internal API key for `satellite-provider` (architecture.md § 7) | Operator workstation secret manager (file or system keyring); NEVER copied onto the companion image |
| `COMPANION_DB_URL` | Direct DB connection to the companion (post-landing) | Set transiently when the operator plugs the companion in for FDR retrieval / upload |
| `OPERATOR_TOOLING_BIND_ADDR` | Pre-flight UI bind address (C12) | `127.0.0.1:8080` (workstation-local; never exposed to network) |

### `.env.example`

Two example files are committed:

`.env.example.dev-tier1`:

```env
# dev-tier1 - workstation Docker compose
DB_URL=postgresql://gps_denied:dev@db:5432/gps_denied
SATELLITE_PROVIDER_URL=http://mock-sat:5100
CAMERA_CALIBRATION_PATH=/fixtures/calibration/adti26.json
GPS_DENIED_FC_PROFILE=ardupilot_plane
GPS_DENIED_VIO_STRATEGY=okvis2
GPS_DENIED_VPR_STRATEGY=ultra_vpr
GPS_DENIED_BUILD_KIND=deployment
GPS_DENIED_FDR_RETENTION_DAYS=7
GPS_DENIED_REPLAY_FIXTURE=/fixtures/flight_derkachi
LOG_LEVEL=DEBUG
MAVLINK_SIGNING_KEY_PATH=/fixtures/keys/dev_mavlink_signing.key
ONBOARD_TILE_SIGNING_KEY_PATH=/fixtures/keys/dev_onboard_signing.key
MOCK_FAILURE_PROFILE=none
```

`.env.example.operator-workstation`:

```env
# operator workstation
# Local mirror; replace with the lab VPN URL if applicable.
SATELLITE_PROVIDER_URL=http://localhost:5100
# Populate from the workstation secret manager; NEVER commit.
SATELLITE_PROVIDER_API_KEY=
# Set when the companion is plugged in for FDR retrieval.
COMPANION_DB_URL=
OPERATOR_TOOLING_BIND_ADDR=127.0.0.1:8080
```

### Variable validation

The runtime composition root (`src/composition/runtime_root.py`, ADR-009) validates every required variable at startup and fails fast with a clear error message. Specifically:

- **Type validation** for enums (`GPS_DENIED_FC_PROFILE`, `GPS_DENIED_VIO_STRATEGY`, etc.) against the strategies linked into the binary (ADR-002 enforcement at config layer).
- **Path validation** for every `*_PATH` variable: the file must exist and, where applicable, its content hash must match the `manifests` table entry.
- **Forbidden-pair validation**: `GPS_DENIED_BUILD_KIND=deployment` AND `GPS_DENIED_VIO_STRATEGY=vins_mono` is rejected at startup ("vins_mono is not linked into the deployment binary"). The same check is repeated for any research-only strategy.
- **Production hardening**: when `LOG_LEVEL=DEBUG` is set on a binary built with `GPS_DENIED_BUILD_KIND=deployment` AND a manifest indicates a production deployment, the runtime emits a warning and downgrades to `INFO`. A flag `GPS_DENIED_ALLOW_DEBUG_IN_PROD=1` is required to override (only set when an engineer is debugging a returned-from-flight unit on the bench).

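The validation rules above could be sketched roughly as follows. This is a minimal illustration, not the contents of `runtime_root.py`: the names `ConfigError`, `check_path`, `validate_env`, and the `LINKED_VIO` table are hypothetical, and the strategy set listed for each build kind is an assumption (the source only states that `vins_mono` is absent from the deployment binary).

```python
import hashlib
from pathlib import Path
from typing import Mapping, Optional

# Illustrative link sets only; the real sets are fixed by the binary's build flags.
LINKED_VIO = {
    "deployment": {"okvis2"},
    "research": {"okvis2", "vins_mono", "klt_ransac"},
}
FC_PROFILES = {"ardupilot_plane", "inav"}


class ConfigError(ValueError):
    """Raised at startup so the process fails fast on bad configuration."""


def check_path(path: str, expected_sha256: Optional[str] = None) -> None:
    """Path validation: the file must exist and, if given, match a content hash."""
    file = Path(path)
    if not file.is_file():
        raise ConfigError(f"{path}: file does not exist")
    if expected_sha256 is not None:
        actual = hashlib.sha256(file.read_bytes()).hexdigest()
        if actual != expected_sha256:
            raise ConfigError(f"{path}: content hash does not match manifest")


def validate_env(env: Mapping[str, str], production: bool) -> dict:
    """Validate a subset of the variables and return the effective settings."""
    build_kind = env.get("GPS_DENIED_BUILD_KIND", "")
    if build_kind not in LINKED_VIO:
        raise ConfigError(f"GPS_DENIED_BUILD_KIND: unknown value {build_kind!r}")
    fc_profile = env.get("GPS_DENIED_FC_PROFILE", "")
    if fc_profile not in FC_PROFILES:
        raise ConfigError(f"GPS_DENIED_FC_PROFILE: unknown value {fc_profile!r}")
    # Forbidden-pair check: the strategy must be linked into this build kind.
    vio = env.get("GPS_DENIED_VIO_STRATEGY", "")
    if vio not in LINKED_VIO[build_kind]:
        raise ConfigError(f"{vio} is not linked into the {build_kind} binary")
    # Production hardening: DEBUG is downgraded unless explicitly allowed.
    log_level = env.get("LOG_LEVEL", "INFO")
    allow_debug = env.get("GPS_DENIED_ALLOW_DEBUG_IN_PROD") == "1"
    if production and build_kind == "deployment" and log_level == "DEBUG" and not allow_debug:
        log_level = "INFO"  # the real runtime would also emit a warning here
    return {"build_kind": build_kind, "fc_profile": fc_profile, "vio": vio, "log_level": log_level}
```

Raising a single exception type with the variable name in the message keeps the failure readable from a boot log, which matters on a unit with no interactive console.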
## Secrets Management

The threat model (architecture.md § 7) treats the airborne companion as a **remote untrusted endpoint**: a downed UAV's companion can be physically captured. Secrets stored on the companion must therefore be **per-flight ephemeral** wherever feasible.

| Environment | Mechanism | Tool |
|-------------|--------|------|
| `dev-tier1` | `.env` file (git-ignored) + dev keys (committed test fixtures, clearly marked) | dotenv |
| `dev-tier2` | `.env` file (git-ignored) + dev keys | dotenv |
| `staging-tier1` | GitHub Actions secrets | GitHub-managed |
| `staging-tier2` | GitHub Actions secrets injected onto the self-hosted Jetson runner | GitHub-managed |
| `production` (companion) | **Per-flight ephemeral keys** generated at takeoff load by the takeoff bring-up sequence (C8 signing handshake + per-flight tile signing key seed); written to `/var/lib/gps-denied/per-flight/`; logged to FDR; deleted on flight-ring rollover (≥ 30 days post-landing default) | Local filesystem; no external secret manager |
| `production-operator-workstation` | OS-level secret store (keyring / GNOME secrets / macOS keychain) for the long-lived `SATELLITE_PROVIDER_API_KEY` | OS keyring + workstation hardening |

### Per-flight key lifecycle (production companion)

1. **Pre-flight**: operator stages cache + calibration + manifests. NO secrets are baked into the JetPack image; the image is identical across all UAVs the operator deploys.
2. **Takeoff load (F2)**: the takeoff sequence generates two sets of ephemeral key material:
   - MAVLink-2.0 per-flight signing key (D-C8-9 = (d), driven by C8), used only on the AP wired channel; iNav has no signing.
   - Onboard tile-signing keypair (D-PROJ-2 design task #1 contract), used to sign every mid-flight tile so the parent suite's planned voting layer can authenticate the source.
3. **In flight**: keys live at `/var/lib/gps-denied/per-flight/*.key` (mode 0600, owned by the runtime UID). The MAVLink signing key fingerprint is logged to FDR record `MavlinkSigningKeyRotated`; the onboard signing pubkey hash is recorded in the `flights` table.
4. **Post-landing**: the operator's C11 `TileUploader` uses the onboard tile-signing private key to assemble the upload payload; it's the only post-flight consumer.
5. **Rollover**: when the FDR ring drops a flight, the per-flight key files for that flight are deleted by the same atomic step.

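Steps 2 and 3 of the lifecycle above can be sketched for the MAVLink signing key, which in MAVLink 2 is a 32-byte shared secret. The function name `write_ephemeral_key` and the choice of SHA-256 as the fingerprint are assumptions for illustration; the onboard tile-signing keypair would need an asymmetric-crypto library and is left out of this stdlib-only sketch.

```python
import hashlib
import os
import secrets
import tempfile
from pathlib import Path


def write_ephemeral_key(directory: Path, name: str, n_bytes: int = 32) -> str:
    """Create a random key file with mode 0600 and return its SHA-256 fingerprint."""
    key = secrets.token_bytes(n_bytes)
    path = directory / name
    # O_EXCL refuses to overwrite a key left over from a previous flight.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    # Only the fingerprint leaves this function; the key itself stays on disk.
    return hashlib.sha256(key).hexdigest()


# Demo in a temporary directory standing in for /var/lib/gps-denied/per-flight/.
with tempfile.TemporaryDirectory() as tmp:
    fingerprint = write_ephemeral_key(Path(tmp), "mavlink_signing.key")
    mode = (Path(tmp) / "mavlink_signing.key").stat().st_mode & 0o777
    print(oct(mode), len(fingerprint))
```

Passing the mode to `os.open` at creation time avoids a window in which the file exists with wider permissions, which a create-then-`chmod` sequence would leave open.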
### No long-lived secrets on the production companion image

| Type | Where it lives |
|---|---|
| `SATELLITE_PROVIDER_API_KEY` | Operator workstation only; never on the companion image (architecture.md § 7) |
| Per-flight MAVLink signing key | Generated on companion at takeoff; per-flight ephemeral |
| Per-flight onboard tile-signing key | Generated on companion at takeoff; per-flight ephemeral |
| Production deployment binary signing key | Release-time; lives only in the Tier-1 release secret manager |
| JetPack image signing key | Same as above |

This means the threat surface on a captured companion reduces to "what is in the FDR for the current flight" plus "the public keys of the upstream signing roots"; the latter is publishable without harm.

### Rotation policy

| Secret | Rotation cadence | Procedure |
|---|---|---|
| Per-flight MAVLink signing key | Every flight (per-flight ephemeral) | Automated at takeoff load |
| Per-flight onboard tile-signing key | Every flight (per-flight ephemeral) | Automated at takeoff load |
| `SATELLITE_PROVIDER_API_KEY` | Operator-managed; rotated when an operator workstation is reissued or compromise is suspected | Operator workstation hardening procedure (out of scope of this document; operator-tooling C12 owns it) |
| Production binary signing key | Per release cycle or on suspected compromise | Release engineer rotates; new key fingerprint is published in release notes; verification scripts on the operator workstation pull the latest fingerprint |
| JetPack image signing key | Same as production binary signing key | Same |

## Database Management

Each companion has its **own local PostgreSQL 16** instance: no shared upstream database, no cluster, no replication. data_model.md § 1 makes this explicit: the companion DB is per-companion; cross-companion coordination happens via `satellite-provider` post-landing only.

| Environment | Type | Migrations | Data |
|-------------|------|-----------|------|
| `dev-tier1` | Docker `postgres:16-alpine`, named volume | Applied on container start by an init script; Alembic-managed (data_model.md § 4) | Seed data via `tests/fixtures/seed-db.sql` |
| `dev-tier2` | PostgreSQL 16 native on the Jetson (or via developer-installed deb packages) | Applied via `alembic upgrade head` invoked by the takeoff-load script | Same seed fixtures |
| `staging-tier1` | Docker `postgres:16-alpine` | Applied by the test runner before scenarios start | Sealed fixture rows |
| `staging-tier2` | PostgreSQL 16 on the Jetson runner | Applied by the test runner | Sealed fixture rows + per-scenario synthetic injections (NFT-SEC-01 cache-poisoning Monte Carlo, etc.) |
| `production` | PostgreSQL 16 on the Jetson, native install (part of the JetPack image) | Applied at JetPack image build time by the image builder; companion runtime asserts `alembic current == head` at takeoff load and refuses takeoff on mismatch | Live data only (data_model.md § 5 hard rule: production NEVER seeds) |
| `production-operator-workstation` | Workstation's local `satellite-provider` mirror has its own DB; operator tooling does NOT run a separate DB | Mirror DB is `satellite-provider`'s concern; operator tooling reads it but does not migrate it | Mirror data |

### Migration rules (data_model.md § 4 + § 6)

- All migrations must be **additive-only by default** (data_model.md § 6.1).
- All migrations must be **reversible by default** (data_model.md § 4.2). Non-reversible migrations require an ADR + user sign-off.
- The `tiles` schema specifically has its **canonical columns frozen** (data_model.md § 6.3); coordinate any change with `satellite-provider`'s schema owner.
- Production migrations are applied at JetPack image build time, not at runtime. The companion never invokes `alembic upgrade` against a live database in flight; it only verifies `alembic current == head`.
- Migration scripts are reviewed in the same PR that adds the schema change; a PR-level checklist line in the PR template references this rule.

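The verify-only rule for production can be sketched as a gate that compares two revision identifiers and never attempts an upgrade. The names `SchemaMismatch` and `assert_schema_at_head` are hypothetical. In a real setup the two values would most likely come from Alembic itself (for example `MigrationContext.get_current_revision()` and `ScriptDirectory.get_current_head()`), but this sketch takes them as plain arguments to stay free of dependencies.

```python
from typing import Optional


class SchemaMismatch(RuntimeError):
    """Raised when the database revision differs from the expected head."""


def assert_schema_at_head(current: Optional[str], head: str) -> None:
    """Refuse takeoff unless the database is exactly at the head revision.

    `current` is None when the database has never been migrated.
    This function only verifies; it deliberately never upgrades.
    """
    if current != head:
        raise SchemaMismatch(
            f"schema revision {current!r} does not match head {head!r}; refusing takeoff"
        )


# Revision ids below are made-up placeholders.
assert_schema_at_head("a1b2c3d4e5f6", "a1b2c3d4e5f6")  # passes silently
try:
    assert_schema_at_head(None, "a1b2c3d4e5f6")
except SchemaMismatch as exc:
    print(exc)
```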
## Configuration Loading Order

Composition root (`src/composition/runtime_root.py`) loads configuration in this strict order; later sources override earlier ones:

1. `_docs/02_document/runtime_config_defaults.yaml` (project-wide defaults; committed)
2. `/etc/gps-denied/runtime.yaml` (per-airframe overrides; baked into the JetPack image)
3. Environment variables (highest precedence on production; second-highest in dev where the next item exists)
4. `--config-override KEY=VALUE` CLI flags (developer convenience; rejected on production by the manifest validator)

The full resolved configuration is logged to FDR as a `ComponentLifecycleEvent` of type `runtime_config_resolved` at takeoff load; this is the audit record for "what config did this flight actually run with".

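The precedence order above amounts to a layered merge. The sketch below assumes flat string keys and already-parsed layers; the function name `resolve_config` and its parameters are hypothetical, and the real composition root may structure this differently (nested YAML, typed settings, and so on).

```python
from typing import Mapping, Optional


def resolve_config(
    defaults: Mapping[str, str],
    airframe: Mapping[str, str],
    env: Mapping[str, str],
    cli_overrides: Optional[Mapping[str, str]] = None,
    production: bool = False,
) -> dict:
    """Merge configuration layers; later layers override earlier ones."""
    if production and cli_overrides:
        # CLI overrides are a developer convenience and are rejected on production.
        raise ValueError("--config-override is rejected on production")
    resolved: dict = {}
    for layer in (defaults, airframe, env, cli_overrides or {}):
        resolved.update(layer)
    return resolved


# Example values are illustrative, not the project's actual defaults.
config = resolve_config(
    defaults={"LOG_LEVEL": "INFO", "GPS_DENIED_FC_PROFILE": "ardupilot_plane"},
    airframe={"GPS_DENIED_FC_PROFILE": "inav"},
    env={"LOG_LEVEL": "DEBUG"},
    cli_overrides={"GPS_DENIED_FDR_RETENTION_DAYS": "7"},
)
print(config)
```

Returning one plain dictionary makes it straightforward to serialize the resolved result as a single audit record.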