# Phase 4 — Configuration & Infrastructure Review

**Review date**: 2026-05-19

**Scope**: Container build files, docker-compose topology, env templates, committed-secret hygiene, network policy, gitignore.

**Files reviewed**:

- `docker/companion-tier1.Dockerfile`
- `docker/operator-orchestrator.Dockerfile`
- `docker/mock-suite-sat-service.Dockerfile`
- `e2e/runner/Dockerfile`
- `e2e/fixtures/mock-suite-sat/Dockerfile`
- `e2e/fixtures/tile-cache-builder/Dockerfile`
- `e2e/docker/docker-compose.test.yml`
- `e2e/docker/docker-compose.tier2-bridge.yml`
- `e2e/docker/secrets/{README.md,mavlink_passkey}`
- `e2e/fixtures/secrets/{README.md,mavlink-test-passkey.txt}`
- `e2e/runner/requirements.txt`
- `.env.example`
- `.gitignore`
- `scripts/run-tests.sh`, `scripts/run-tests-jetson.sh`

## Summary

| Severity (this project) | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 3 |
| Low | 6 |
| Informational / positive observations | 8 |

The closed-system threat model (no inbound listeners, no airborne network egress; see Phase 3 § A04) caps the blast radius of any container-hardening gap. The Medium-severity findings would all be raised to High in a multi-tenant or internet-exposed deployment; here they are Medium because the airborne / operator-workstation surface keeps an in-container attacker contained.

## Findings

### F14 — Production Dockerfiles run as `root` (no `USER` directive)

**Severity (this project)**: Medium

**Locations**: `docker/companion-tier1.Dockerfile` (entrypoint at line 55), `docker/operator-orchestrator.Dockerfile` (entrypoint at line 22).

Neither production Dockerfile drops privileges before `ENTRYPOINT`. The Python runtime executes as UID 0 inside the container.

**Project-specific context**: the SUT has no inbound network listener, so an external attacker has no direct path to in-container code execution.
The risk is post-compromise: any RCE via a dependency vulnerability (e.g., a future equivalent of Phase 1's F1–F12) executes as root in the container, with access to the mounted volumes (`/var/azaion/fdr`, `/var/azaion/tile-cache:ro`). The read-only mount limits damage to the tile cache, but `fdr-output` is read-write.

**Evidence the pattern is known to the project**: `e2e/fixtures/tile-cache-builder/Dockerfile:43-46` already implements the correct pattern:

```Dockerfile
RUN useradd -u 10001 -m -d /home/builder builder \
    && mkdir -p /input /output \
    && chown -R builder:builder /opt/builder /input /output
USER 10001:10001
```

**Remediation**: replicate the same `useradd` + `chown` + `USER` block in both production Dockerfiles. Choose a stable UID (e.g., 10100 for the companion, 10200 for the orchestrator) and chown `/opt/gps-denied`, `/opt/venv`, `/var/azaion/fdr` accordingly.

---

### F15 — Production images install `[dev]` extras

**Severity (this project)**: Medium

**Locations**: `docker/companion-tier1.Dockerfile:27` (`pip install --no-cache-dir -e ".[dev]"`), `docker/operator-orchestrator.Dockerfile:14` (`pip install --no-cache-dir -e ".[dev]"`).

The production runtime image ships with the `[dev]` extras: `pytest`, `pytest-asyncio`, `ruff`, `mypy`, `black`, `pytest-cov`, etc. This (a) roughly doubles image size, (b) increases the attack surface inside the container (each test-only dep is a CVE candidate, and dev tools like `pytest` parse user-supplied files), and (c) muddies the dependency lockfile audit.

**Project-specific context**: same closed-system bound as F14: an attacker needs in-container execution first. Once inside, however, these packages substantially increase the number of Python modules available to that attacker.

**Remediation**: define a runtime-only extras group in `pyproject.toml` (or rely on the base install with no extras) and use `pip install --no-cache-dir -e ".[runtime]"` or just `pip install --no-cache-dir -e .` in the production Dockerfile.
Keep `[dev]` for developer environments and the e2e-runner only.

---

### F16 — Test-stack base images use moving / `latest` tags

**Severity (this project)**: Medium (for `ardupilot/mavproxy:latest`), Low (for `ardupilot/ardupilot-sitl:plane-stable`)

**Locations**:

- `e2e/docker/docker-compose.test.yml:41` — `ardupilot/ardupilot-sitl:plane-stable`
- `e2e/docker/docker-compose.test.yml:67` — `ardupilot/mavproxy:latest`
- `e2e/docker/docker-compose.test.yml:49` — `inavflight/inav-sitl:9.0.0` (this one IS pinned — good)

**Project-specific context**: the test stack runs on the `e2e-net` network with `internal: true` (egress blocked), so a hostile image's network capability is neutered at the Docker level. The remaining risk is a build-reproducibility regression: a new release published under the same tag could break or change SITL behaviour silently between CI runs.

**Remediation**: pin both to explicit versions (`mavproxy:1.8.55` style) or to a SHA256 digest (`mavproxy@sha256:...`), matching the pattern at `e2e/fixtures/tile-cache-builder/Dockerfile:20`, which uses a full SHA256 digest.

---

### F17 — Production Dockerfile base images use floating tags

**Severity (this project)**: Low

**Locations**: `docker/companion-tier1.Dockerfile:8,38` (`ubuntu:22.04`), `docker/operator-orchestrator.Dockerfile:4` (`python:3.10-slim`).

These tags receive security-patch updates without explicit opt-in. That is desirable for OS patching, but it conflicts with bit-reproducible builds and the supply-chain audit goal.

**Project-specific context**: Ubuntu LTS and `python:slim` are reasonable defaults; the failure mode is "two builds of the same commit hash produce different base layers", which complicates incident response (which `libc6` did the failing build ship?).

**Remediation**: pin to SHA256 digest at release-tag time; bump explicitly on dependency-refresh cycles. Same pattern as `tile-cache-builder/Dockerfile:20`.

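
A check along the following lines could keep the tag-pinning findings (F16, F17) from regressing. It is a minimal sketch in Python, not code from the reviewed repository: the function name `unpinned_base_images` is hypothetical, and the rule it applies (every external `FROM` reference must carry an `@sha256:` digest) is one assumed reading of the remediation above.

```python
import re

# Hypothetical lint helper; not part of the reviewed repository.
# Matches "FROM [--platform=...] <image>" and captures the image reference.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
ALIAS_RE = re.compile(r"\sAS\s+(\S+)\s*$", re.IGNORECASE)


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return FROM references that are not pinned to an @sha256 digest.

    Earlier build stages (FROM ... AS name) and `scratch` are skipped,
    because they do not pull anything from a registry.
    """
    stage_names: set[str] = set()
    unpinned: list[str] = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match is None:
            continue
        image = match.group(1)
        is_local = image.lower() == "scratch" or image in stage_names
        if not is_local and "@sha256:" not in image:
            unpinned.append(image)
        alias = ALIAS_RE.search(line)
        if alias is not None:
            stage_names.add(alias.group(1))
    return unpinned
```

Run against each Dockerfile's text in CI, a non-empty result would fail the build. References built from `ARG` substitution are reported as unpinned because the sketch does not expand variables.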

---

### F18 — Orphan / stale `docker/mock-suite-sat-service.Dockerfile`

**Severity (this project)**: Low

**Location**: `docker/mock-suite-sat-service.Dockerfile`.

This file references `tests/fixtures/mock-suite-sat-service/` (this path does NOT exist; the real fixture lives at `e2e/fixtures/mock-suite-sat/`) and declares port 5100 with health path `/healthz`, while the working build (`e2e/docker/docker-compose.test.yml:54`, `build: ../fixtures/mock-suite-sat`) uses port 8080 with health path `/mock/health`. The `docker/`-side file is not referenced by any active compose target.

**Project-specific context**: not a runtime vulnerability — orphan artifacts are dead code in the build system. The risk is operator confusion ("which Dockerfile does the mock build from?") and accidental future use of the broken file.

**Remediation**: delete `docker/mock-suite-sat-service.Dockerfile`, OR fix it to be a thin wrapper around `e2e/fixtures/mock-suite-sat/Dockerfile`. (Project pattern: `docker/` should hold production-only Dockerfiles; test fixtures should live under `e2e/`.)

---

### F19 — Unused `curl` binary in production runtime image

**Severity (this project)**: Low

**Location**: `docker/operator-orchestrator.Dockerfile:9` (`curl` in the runtime apt-get install).

The healthcheck uses `python3 -m gps_denied_onboard.healthcheck` (line 20), not `curl`. `curl` is a classic post-compromise tool (data exfil, second-stage payload fetch) and provides no runtime value.

**Remediation**: remove `curl` from the runtime apt-get install line.

---

### F20 — Runner image `opencv-python>=4.12.0` has no upper bound

**Severity (this project)**: Low

**Location**: `e2e/runner/requirements.txt:25`.

While the comment block at lines 4–6 correctly notes that the runner does not depend on `gtsam` (so the D-CROSS-CVE-1 numpy<2 ABI block doesn't apply), the specifier has no upper bound: a future opencv 5.x release could ship a behaviour break that lands automatically on the next CI rebuild.


**Remediation**: add an upper bound consistent with the rest of `requirements.txt` style: `opencv-python>=4.12.0,<5.0`.

---

### F21 — Stale path in `.env.example`

**Severity (this project)**: Low

**Location**: `.env.example:29` — `MAVLINK_SIGNING_KEY=tests/fixtures/mavlink_signing/dev_key`.

That path predates the secrets reorganization that landed `e2e/fixtures/secrets/mavlink-test-passkey.txt` + `e2e/docker/secrets/mavlink_passkey`. This is confusing for a new developer.

**Remediation**: update to the current path conventions. Also note that the env var name itself (`MAVLINK_SIGNING_KEY`) is inconsistent with the production env var the docker-compose actually sets (`MAVLINK_SIGNING_PASSKEY_FILE`); align both.

---

### F22 — Production WORKDIR is not chowned

**Severity (this project)**: Low (depends on whether F14 is fixed first)

**Locations**: `docker/companion-tier1.Dockerfile:50` (`WORKDIR /opt/gps-denied`), `docker/operator-orchestrator.Dockerfile:12`.

If/when F14's non-root `USER` directive is added, the runtime user will not own `/opt/gps-denied` and will fail to write any artefact there (e.g., the tmpfs FDR pre-buffer). Today this is dormant because the container runs as root. Filing as a coupled remediation item to F14.

**Remediation**: when adding the `USER` directive, also add `chown -R <user>:<group> /opt/gps-denied /opt/venv /var/azaion`.

---

## Positive Observations

### P5 — Test network is enforced as `internal: true`

`e2e/docker/docker-compose.test.yml:117-124` declares `e2e-net.internal: true`. The SUT, mock, runner, and SITLs can talk to each other but none can reach the public internet. The e2e-runner verifies this at runtime by attempting a TCP connect to `1.1.1.1:443` (AC-5 of `NFT-SEC-02`). This is the docker-compose-layer counterpart to the production iptables / DNS blackhole (RESTRICT-OPS-1 / NFT-SEC-05).

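
The runtime egress probe described above can be approximated in a few lines. This is a sketch only, assuming a plain TCP connect with a short timeout; the function name `egress_blocked` and the default timeout are hypothetical and are not taken from the `NFT-SEC-02` harness.

```python
import socket


def egress_blocked(host: str = "1.1.1.1", port: int = 443,
                   timeout_s: float = 3.0) -> bool:
    """Return True when an outbound TCP connect to host:port fails.

    Hypothetical helper: any OSError (unreachable network, refused
    connection, timeout) is treated as "egress is blocked".
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return False  # handshake succeeded, so egress is possible
    except OSError:
        return True
```

Inside a container on an `internal: true` network the call is expected to return `True`; a `False` result would mean the container reached the target and the isolation has regressed.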

### P6 — Committed test secrets are demonstrably synthetic

Both committed secrets files (`e2e/docker/secrets/mavlink_passkey` and `e2e/fixtures/secrets/mavlink-test-passkey.txt`) contain the same canonical pattern `0123456789abcdef...` repeated, and both README files explicitly state "TEST ONLY — not for production use" with the production-side wiring documented. The `e2e/_unit_tests/test_directory_layout.py::test_passkey_files_match` assertion keeps the two files in lock-step (verified separately during the SUT review). No real secret is in version control.

### P7 — `e2e/runner/Dockerfile` follows the public-boundary contract

The runner image:

- Pins `python:3.12-slim-bookworm` (line 11) — explicit tag.
- Uses `tini` as PID 1 (zombie reaping under `pytest --forked`).
- Does NOT install the SUT package and explicitly excludes `src/` from `PYTHONPATH` (line 45 — `ENV PYTHONPATH=/opt/e2e-runner:/opt/e2e-runner/runner` only).
- Sets `PYTHONDONTWRITEBYTECODE=1`, `PYTHONUNBUFFERED=1`, `PIP_NO_CACHE_DIR=1`, `PIP_DISABLE_PIP_VERSION_CHECK=1`.

### P8 — `e2e/fixtures/tile-cache-builder/Dockerfile` is gold-standard

It pins Python to SHA256-digest (`python:3.10.14-slim-bookworm@sha256:...`), pins every Python dep with version bounds, drops to a numbered non-root user (`USER 10001:10001`), explicitly chowns the workdir, and sets `PYTHONHASHSEED=0` for reproducibility (line 24). This is the pattern the rest of the project should match.

### P9 — `.gitignore` covers secrets and build artefacts comprehensively

`*.key`, `.env`, `.env.local` are blocked. The single explicit allow (`!tests/fixtures/mavlink_signing/dev_key`) is documented in the README. Build outputs (`.engine`, `.calib`, `.index`, `.faiss`, `.onnx`, `.trt`) are excluded. CMake artefacts (`build/`, `_skbuild/`, `compile_commands.json`) are excluded.


### P10 — Docker secrets are used for the test SUT (not env vars)

`e2e/docker/docker-compose.test.yml:30-32` mounts the test mavlink passkey via a Docker `secrets:` declaration (`mavlink_passkey` → `/run/secrets/mavlink_passkey`), not via the `environment:` block. The SUT reads from `MAVLINK_SIGNING_PASSKEY_FILE=/run/secrets/mavlink_passkey`, so the passkey content never crosses the container env. Production mirrors the same wiring (with a file mounted from a real secret store). Correct pattern.

### P11 — Healthchecks defined on every service

`gps-denied-onboard` (line 35), `mock-suite-sat-service` (line 61), and the production Dockerfiles themselves all declare `HEALTHCHECK`. `depends_on` uses `condition: service_healthy` for the SUT and mock (lines 106-109).

### P12 — `internal: true` AND no `ports:` block

No production service in `docker-compose.test.yml` publishes a port to the host. The only host-reachable surface is via the `e2e-results` bind mount, which is a read-only artefact dropbox (line 142). Defense-in-depth on top of `internal: true`.

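
The two properties in P12 can also be asserted statically. The sketch below assumes the compose file has already been parsed into a plain mapping (for example with a YAML loader, which is not shown); the function name `compose_exposure_problems` and the sample data are hypothetical and do not reproduce the real `docker-compose.test.yml`.

```python
from typing import Any, Mapping


def compose_exposure_problems(compose: Mapping[str, Any]) -> list[str]:
    """List services that publish ports and networks that are not internal.

    Hypothetical helper operating on an already-parsed compose mapping.
    """
    problems: list[str] = []
    for name, service in (compose.get("services") or {}).items():
        # A non-empty `ports:` list publishes the service on the host.
        if (service or {}).get("ports"):
            problems.append(f"service {name!r} publishes ports")
    for name, network in (compose.get("networks") or {}).items():
        # A network declared with no body (None) defaults to non-internal.
        if not (network or {}).get("internal", False):
            problems.append(f"network {name!r} is not internal")
    return problems


# Illustrative data only: the second service deliberately violates P12.
sample = {
    "services": {
        "gps-denied-onboard": {"networks": ["e2e-net"]},
        "mock-suite-sat-service": {"ports": ["8080:8080"]},
    },
    "networks": {"e2e-net": {"internal": True}},
}
print(compose_exposure_problems(sample))
```

An empty list means both properties hold for the parsed file; any entry names the service or network that broke them.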

## Cross-Reference Index

| Source | Phase 4 § | Note |
|---|---|---|
| `_docs/02_document/deployment/containerization.md` | F14, F15, F17, F22 | Docs the project's container conventions |
| `_docs/02_document/deployment/environment_strategy.md` | F16, F21 | Docs env-var contract |
| `_docs/02_document/tests/environment.md` § Communication with SUT | P10, F21 | Production passkey wiring |
| `_docs/05_security/dependency_scan.md` | F15, F20 | Phase 1 deps audit (the dev extras shipping to production are part of Phase 1's surface) |
| `_docs/02_document/tests/security-tests.md` § NFT-SEC-02 | P5 | The harness-side enforcement of the `internal: true` network |
| `e2e/fixtures/tile-cache-builder/Dockerfile` | F14, F17 | Project's existing reference implementation of the pattern |

## Self-Verification

- [x] All Dockerfiles in the repo scanned: 6 files (`docker/*.Dockerfile` × 3, `e2e/runner/Dockerfile`, `e2e/fixtures/*/Dockerfile` × 2)
- [x] All docker-compose files scanned: 2 (`docker-compose.test.yml`, `docker-compose.tier2-bridge.yml`)
- [x] All committed secret files inspected; content verified as synthetic test data
- [x] `.gitignore` reviewed for secret-exclusion completeness
- [x] `.env.example` reviewed for accidentally-committed credentials
- [x] Findings cite file:line evidence
- [x] Project-specific severity calibration applied (closed-system threat model recognized)