GPS-Denied Onboard — Containerization
Date: 2026-05-09 (Plan Phase 2c — initial draft). Inputs:
_docs/02_document/architecture.md § 3 (Deployment Model); _docs/00_problem/restrictions.md § Onboard Hardware; ADR-002 (build-time exclusion of unused strategies); ADR-005 (Tier-1 / Tier-2 are first-class).
Containerization scope
This project has asymmetric containerization by design (architecture.md § 3, ADR-005):
- Tier-1 (workstation): Docker is the universal runtime. Dev, lint, unit, most integration, and mock-suite-sat-service all run in Docker Compose.
- Tier-2 (Jetson): NO Docker. The deployed JetPack image runs the deployment binary natively. TensorRT INT8 calibration caches and jetson-stats thermal telemetry are most reliable without a container layer (D-C7-9 + D-C10-6). The "image" is a JetPack 6.2 system image with the deployment binary preinstalled.
- Operator workstation: Docker is used for the local satellite-provider mirror, the mock-suite-sat-service (when offline), and the operator-tooling stack (C11 Tile Manager + C12 Operator Pre-flight Tooling).
Three Dockerfiles are maintained; the airborne companion uses none of them in production.
Component Dockerfiles
gps-denied-companion-tier1 (Tier-1 dev / CI only)
This image is for fast iterative development on a workstation. It is never flashed onto a Jetson.
| Property | Value |
|---|---|
| Base image | nvidia/cuda:12.6.0-runtime-ubuntu22.04 (or python:3.10-slim if no GPU on dev box) |
| Build image | nvidia/cuda:12.6.0-devel-ubuntu22.04 |
| Stages | system-deps → python-deps → cpp-build (CMake + GTSAM + FAISS + OpenCV + OKVIS2 + KltRansac) → runtime |
| User | companion (UID 1000, non-root) |
| Health check | python -m gps_denied.healthcheck (validates calibration JSON loadable + DB reachable + FAISS index mmap-able). 30 s interval. |
| Exposed ports | 5101/tcp (companion control plane — Tier-1 only; Tier-2 production has no inbound network) |
| Key build args | BUILD_VINS_MONO=OFF (deployment build), BUILD_SALAD=OFF; BUILD_VINS_MONO=ON BUILD_SALAD=ON for the research build |
| Notes | Two distinct image tags built on every PR: companion-tier1:deployment-<sha> and companion-tier1:research-<sha> (ADR-002). |
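The health check row above names three validations (calibration JSON loadable, DB reachable, FAISS index mmap-able). A minimal sketch of how such a check could be structured follows; it is not the project's actual `gps_denied.healthcheck` module. `CAMERA_CALIBRATION_PATH` and `DB_URL` are taken from the compose file, while `FAISS_INDEX_PATH` and all function names are assumptions made for illustration. DB reachability is approximated by a plain TCP connect rather than a real database handshake.

```python
"""Hypothetical health-check sketch; names and FAISS_INDEX_PATH are assumptions."""
import json
import mmap
import socket
from typing import Dict, Mapping
from urllib.parse import urlparse


def calibration_loadable(path: str) -> bool:
    """True if the file exists and parses as a JSON object."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return isinstance(json.load(fh), dict)
    except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
        return False


def db_reachable(db_url: str, timeout_s: float = 2.0) -> bool:
    """True if a TCP connection to the host and port in db_url succeeds."""
    try:
        parsed = urlparse(db_url)
        if not parsed.hostname:
            return False
        address = (parsed.hostname, parsed.port or 5432)
        with socket.create_connection(address, timeout=timeout_s):
            return True
    except (OSError, ValueError):  # refused, timed out, or malformed port
        return False


def index_mmapable(path: str) -> bool:
    """True if the file is non-empty and can be memory-mapped read-only."""
    try:
        with open(path, "rb") as fh:
            with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ):
                return True
    except (OSError, ValueError):  # mmap raises ValueError on an empty file
        return False


def run_checks(env: Mapping[str, str]) -> Dict[str, bool]:
    """Run all three checks against paths and URLs found in env."""
    return {
        "calibration": calibration_loadable(env.get("CAMERA_CALIBRATION_PATH", "")),
        "db": db_reachable(env.get("DB_URL", "")),
        # FAISS_INDEX_PATH is a hypothetical variable, not defined in the compose file.
        "faiss_index": index_mmapable(env.get("FAISS_INDEX_PATH", "")),
    }


def main(env: Mapping[str, str]) -> int:
    """Print one line per check and return a process exit code."""
    results = run_checks(env)
    for name, ok in results.items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
    return 0 if all(results.values()) else 1
```

Docker treats exit status 0 as healthy and 1 as unhealthy, so an entry point could end with `sys.exit(main(os.environ))`.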
mock-suite-sat-service (Tier-1 e2e-test fixture; ADR-007 reversed 2026-05-09 — fixture only, not a component)
e2e-test fixture only — implements the planned D-PROJ-2 ingest contract (POST /api/satellite/tiles/ingest) so upload integration tests can run before the real endpoint ships service-side. Production never reaches it; the architectural counterparty for upload is the real satellite-provider. Download integration tests target the real satellite-provider directly (its GET surface is already implemented), not this fixture. Source lives under tests/fixtures/mock-suite-sat-service/, NOT src/components/.
| Property | Value |
|---|---|
| Base image | mcr.microsoft.com/dotnet/aspnet:8.0-alpine (matches the parent suite's stack) |
| Build image | mcr.microsoft.com/dotnet/sdk:8.0-alpine |
| Stages | restore → build → publish → runtime |
| User | mock (non-root) |
| Health check | HTTP GET /healthz (returns 200 if listening + storage backend mounted). 10 s interval. |
| Exposed ports | 5100/tcp (matches satellite-provider's port so the same client config works) |
| Key build args | MOCK_FAILURE_PROFILE (default none; used by NFT-SEC-01 to inject latency / 5xx / partial responses) |
| Notes | The mock is a release artifact (operator-tooling tarball includes its compose file). When the real satellite-provider D-PROJ-2 endpoint ships, the mock is retired. |
operator-tooling (Operator workstation Tile Manager + pre-flight UI, C11 + C12)
| Property | Value |
|---|---|
| Base image | python:3.10-slim |
| Build image | python:3.10-slim (no native deps; pure Python plus httpx for both download and upload, psycopg for read/write of C6 mirror, cryptography for upload signing) |
| Stages | python-deps → runtime |
| User | operator (non-root) |
| Health check | python -m operator_tooling.healthcheck (validates satellite-provider reachable). 30 s interval. |
| Exposed ports | 8080/tcp (operator pre-flight UI, C12); no inbound network for C11 Tile Manager (it's a CLI / one-shot tool, both directions) |
| Key build args | INCLUDE_PRE_FLIGHT_UI=true (default; can be turned off for headless CLI-only deployments) |
| Notes | C11 Tile Manager (both TileDownloader and TileUploader) is in this image, NEVER in gps-denied-companion-tier1 (ADR-004 process-level isolation). The airborne deployment binary on Tier-2 also does not contain C11. |
Docker Compose — Local Development
# docker-compose.yml
services:
  companion:
    build:
      context: .
      dockerfile: docker/companion-tier1.Dockerfile
      args:
        BUILD_VINS_MONO: "OFF"
        BUILD_SALAD: "OFF"
    image: gps-denied/companion-tier1:dev
    environment:
      - DB_URL=postgresql://gps_denied:dev@db:5432/gps_denied
      - SATELLITE_PROVIDER_URL=http://mock-sat:5100
      - CAMERA_CALIBRATION_PATH=/fixtures/calibration/adti26.json
      - LOG_LEVEL=DEBUG
      - GPS_DENIED_FC_PROFILE=ardupilot_plane
    volumes:
      - ./tests/fixtures:/fixtures:ro
      - tile-cache:/var/lib/gps-denied/tiles
      - fdr:/var/lib/gps-denied/fdr
    depends_on:
      db: { condition: service_healthy }
      mock-sat: { condition: service_healthy }
    healthcheck:
      test: ["CMD", "python", "-m", "gps_denied.healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks: [ gps-denied-net ]
  mock-sat:
    build:
      context: ./mock-suite-sat-service
      dockerfile: Dockerfile
    image: gps-denied/mock-suite-sat-service:dev
    environment:
      - ASPNETCORE_URLS=http://+:5100
      - MOCK_FAILURE_PROFILE=none
    volumes:
      - mock-sat-tiles:/srv/tiles
    healthcheck:
      test: ["CMD", "wget", "-q", "-O-", "http://localhost:5100/healthz"]
      interval: 10s
    networks: [ gps-denied-net ]
  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_DB=gps_denied
      - POSTGRES_USER=gps_denied
      - POSTGRES_PASSWORD=dev
    volumes:
      - db-data:/var/lib/postgresql/data
      - ./docker/db-init:/docker-entrypoint-initdb.d:ro
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "gps_denied"]
      interval: 5s
    networks: [ gps-denied-net ]
  operator-tooling:
    build:
      context: .
      dockerfile: docker/operator-tooling.Dockerfile
    image: gps-denied/operator-tooling:dev
    environment:
      - SATELLITE_PROVIDER_URL=http://mock-sat:5100
      - COMPANION_DB_URL=postgresql://gps_denied:dev@db:5432/gps_denied
    ports:
      - "8080:8080"
    depends_on:
      mock-sat: { condition: service_healthy }
    networks: [ gps-denied-net ]
volumes:
  tile-cache:
  fdr:
  db-data:
  mock-sat-tiles:
networks:
  gps-denied-net:
Docker Compose — Tier-1 Integration & Blackbox Tests
# docker-compose.test.yml
services:
  companion:
    extends:
      file: docker-compose.yml
      service: companion
    environment:
      - LOG_LEVEL=INFO
      - GPS_DENIED_REPLAY_FIXTURE=/fixtures/flight_derkachi
      - GPS_DENIED_TIER=1
  mock-sat:
    extends:
      file: docker-compose.yml
      service: mock-sat
    volumes:
      - ./tests/fixtures/tiles_corpus:/srv/tiles:ro
  db:
    extends:
      file: docker-compose.yml
      service: db
    volumes:
      - ./tests/fixtures/seed-db.sql:/docker-entrypoint-initdb.d/01_seed.sql:ro
  e2e-runner:
    build:
      context: ./e2e
      dockerfile: Dockerfile
    image: gps-denied/e2e-runner:dev
    depends_on:
      companion: { condition: service_healthy }
      mock-sat: { condition: service_healthy }
      db: { condition: service_healthy }
    environment:
      - PYTEST_ARGS=--csv=/results/report.csv -v
    volumes:
      - ./e2e/results:/results
    # Join the same network as the extended services so they resolve by name.
    networks: [ gps-denied-net ]
# `extends` copies service definitions only; the named volumes and the network
# those services reference must be redeclared in this file.
volumes:
  tile-cache:
  fdr:
  db-data:
  mock-sat-tiles:
networks:
  gps-denied-net:
Run: docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build.
Tier-2 — Jetson runtime (NO Docker)
The Tier-2 deployment is a JetPack 6.2 system image, not a container. Its assembly is documented in deployment_procedures.md § Production Deployment. Key constraints driving the no-Docker decision (architecture.md § 3, D-C7-9 + D-C10-6):
- TensorRT INT8 calibration caches: most reliable when the SM/JetPack/TRT triple matches the host kernel exactly; container-host abstraction is a known source of drift.
- jetson-stats thermal telemetry: needs root + sysfs access; runs cleanest on bare metal.
- AC-NEW-1 cold-start budget (30 s p95): container start adds 1–2 s overhead the budget cannot afford.
- AC-NEW-3 FDR storage (≤ 64 GB): the FDR ring is mounted on the host's NVM directly; a container layer would either bind-mount (no benefit) or copy (defeats the storage guarantee).
Tier-2 CI runs the same deployment binary directly on the self-hosted Jetson runner, with no container shim.
Image Tagging Strategy
| Context | Tag Format | Example |
|---|---|---|
| CI build (deployment binary) | <registry>/gps-denied/companion-tier1:deployment-<git-sha> | ghcr.io/azaion/gps-denied/companion-tier1:deployment-a1b2c3d |
| CI build (research binary) | <registry>/gps-denied/companion-tier1:research-<git-sha> | ghcr.io/azaion/gps-denied/companion-tier1:research-a1b2c3d |
| Mock sat service | <registry>/gps-denied/mock-suite-sat-service:<git-sha> | ghcr.io/azaion/gps-denied/mock-suite-sat-service:a1b2c3d |
| Operator tooling | <registry>/gps-denied/operator-tooling:<git-sha> | ghcr.io/azaion/gps-denied/operator-tooling:a1b2c3d |
| Release | <registry>/gps-denied/<image>:<semver> | ghcr.io/azaion/gps-denied/companion-tier1:deployment-1.2.0 |
| Local dev | gps-denied/<image>:dev | gps-denied/companion-tier1:dev |
| JetPack image (Tier-2) | gps-denied-jetpack-<semver>-<sha>.img | gps-denied-jetpack-1.2.0-a1b2c3d.img (file artifact, not a container tag) |
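The tag formats in the table can be generated mechanically, which keeps CI scripts from drifting from the convention. The sketch below is one possible helper, not part of the project; `image_ref` and `jetpack_filename` are hypothetical names, and the short-SHA and three-part semver patterns are assumptions inferred from the examples in the table.

```python
import re
from typing import Optional

# Assumed shapes, inferred from the table's examples (a1b2c3d, 1.2.0).
_SHA = re.compile(r"[0-9a-f]{7,40}")
_SEMVER = re.compile(r"\d+\.\d+\.\d+")
_TRACKS = (None, "deployment", "research")


def _valid_version(version: str) -> bool:
    """Accept 'dev', a git SHA, or a semver string."""
    return version == "dev" or bool(
        _SHA.fullmatch(version) or _SEMVER.fullmatch(version)
    )


def image_ref(
    image: str,
    version: str,
    registry: Optional[str] = None,
    track: Optional[str] = None,
) -> str:
    """Build an image reference following the table's tag formats.

    registry=None yields the local-dev form without a registry prefix;
    track prefixes the tag for the two companion-tier1 binary tracks.
    """
    if not _valid_version(version):
        raise ValueError(f"unexpected version: {version!r}")
    if track not in _TRACKS:
        raise ValueError(f"unexpected track: {track!r}")
    tag = f"{track}-{version}" if track else version
    prefix = f"{registry}/" if registry else ""
    return f"{prefix}gps-denied/{image}:{tag}"


def jetpack_filename(semver: str, sha: str) -> str:
    """Build the Tier-2 JetPack image file name (a file artifact, not a tag)."""
    if not _SEMVER.fullmatch(semver) or not _SHA.fullmatch(sha):
        raise ValueError("expected a semver and a git SHA")
    return f"gps-denied-jetpack-{semver}-{sha}.img"
```

For example, `image_ref("companion-tier1", "a1b2c3d", registry="ghcr.io/azaion", track="deployment")` reproduces the first example row, and `image_ref("companion-tier1", "dev")` reproduces the local-dev row.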
SBOM and binary track
CI emits both Tier-1 binary tracks on every PR (ADR-002). After build, an SBOM diff step asserts:
- The deployment-binary SBOM must NOT include vins_mono, salad, or any other research-only library.
- The research-binary SBOM must include every strategy listed in the architecture.
A failing SBOM diff fails the PR. SBOM artifacts are attached to the release; they are NOT shipped on the deployed Jetson image (they live only in the release artifacts directory).
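The two SBOM assertions above amount to a set intersection and a set difference over component names. The following sketch shows that logic under stated assumptions: the SBOM is taken to be a CycloneDX-style JSON document with a top-level `components` list whose entries carry a `name`, names are compared case-insensitively by exact match, and the function names and the required-strategy list are hypothetical. The project's real SBOM format and diff step may differ.

```python
from typing import Iterable, List, Mapping, Set

# Research-only libraries named in this document; extend as needed.
RESEARCH_ONLY = frozenset({"vins_mono", "salad"})


def component_names(sbom: Mapping) -> Set[str]:
    """Collect lower-cased component names from a CycloneDX-style SBOM dict."""
    return {
        str(component["name"]).lower()
        for component in sbom.get("components", [])
        if "name" in component
    }


def leaked_research_libs(
    deployment_sbom: Mapping, research_only: Iterable[str] = RESEARCH_ONLY
) -> List[str]:
    """Research-only libraries found in the deployment SBOM (empty list = pass)."""
    forbidden = {name.lower() for name in research_only}
    return sorted(component_names(deployment_sbom) & forbidden)


def missing_strategies(
    research_sbom: Mapping, required: Iterable[str]
) -> List[str]:
    """Required strategies absent from the research SBOM (empty list = pass)."""
    wanted = {name.lower() for name in required}
    return sorted(wanted - component_names(research_sbom))
```

A CI step could load each SBOM with `json.load` and fail the PR when either function returns a non-empty list. Exact-name matching is the simplest choice; if SBOM tooling reports package names in another form (for example `vins-mono`), the comparison would need normalizing.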
.dockerignore
.git
.cursor
_docs
_standalone
node_modules
**/bin
**/obj
**/__pycache__
**/.venv
**/venv
**/.pytest_cache
**/.mypy_cache
*.md
.env*
docker-compose*.yml
tests/fixtures/large_replays/
The tests/fixtures/large_replays/ exclusion is critical: that directory holds the Derkachi flight footage (multi-GB) which is mounted into the test runner via volumes: rather than baked into images.
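Because a dropped exclusion would silently pull multi-GB footage into every build context, a small guard in CI can confirm the critical patterns are still listed. This is a sketch only: `missing_patterns` and the `REQUIRED` tuple are hypothetical, and the check compares literal pattern lines rather than evaluating Docker's matching rules.

```python
from typing import Iterable, List

# Patterns this sketch treats as mandatory; the selection is an assumption.
REQUIRED = ("tests/fixtures/large_replays/", ".git", ".env*")


def parse_dockerignore(text: str) -> List[str]:
    """Return the pattern lines, skipping blank lines and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def missing_patterns(text: str, required: Iterable[str] = REQUIRED) -> List[str]:
    """List required patterns that do not appear literally in the file.

    A trailing slash is ignored on both sides, so 'dir' and 'dir/' match.
    """
    present = {pattern.rstrip("/") for pattern in parse_dockerignore(text)}
    return [item for item in required if item.rstrip("/") not in present]
```

A CI job could read `.dockerignore`, call `missing_patterns`, and fail when the result is non-empty.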