# Containerisation

**Status**: forward-looking design (Rust). Final shape will surface during build-system bring-up; treat the choices below as the current intent, not commitments.

## 1. Deployment shape

`autopilot` is a single Rust binary. Two delivery options are considered:

| Option | Form | Pros | Cons |
|---|---|---|---|
| **A — native systemd unit** | bare binary deployed to `/usr/local/bin/autopilot` + a `.service` unit | minimum overhead on Jetson; closest to airframe constraints; trivial flight-gate integration | per-host installation discipline; less reproducible across nodes |
| **B — single container image** | `azaion/autopilot:<branch>-arm64` | consistent across environments; matches the suite's existing OTA model (Watchtower) | container runtime adds startup latency and one more failure surface on the airframe |

The decision is **Option A** for the on-airframe deployment (lowest overhead, closest to the autopilot's real-time constraints), and **Option B** for development / CI / emulated-hardware testing (reproducibility wins). The same Rust binary is built once and packaged into both.

## 2. Target hardware

| Item | Value |
|---|---|
| Edge device | NVIDIA Jetson Orin Nano Super 8 GB |
| Architecture | aarch64 |
| OS | Ubuntu 22.04 (JetPack-bundled) — locked JetPack version + power mode |
| Camera | ViewPro A40 (RTSP + UDP control) |
| Autopilot | ArduPilot or PX4 over MAVLink v2 (UDP or serial) |

## 3. Native deployment (Option A — production)

**Layout:**

```text
/usr/local/bin/autopilot                  Rust binary
/etc/azaion/autopilot/config.toml         runtime config
/etc/systemd/system/autopilot.service     systemd unit
/var/lib/autopilot/                       persistent state (mapobjects_store)
/run/azaion/in-flight                     flight-gate marker (per ../_docs/00_top_level_architecture.md)
```

**systemd unit highlights:**

- `Type=notify` — autopilot signals readiness once Tier 1, gimbal, and MAVLink links are healthy.
- `Restart=on-failure`, `RestartSec=2s`, `StartLimitBurst=5` — bounded restart, so a hard-broken binary doesn't loop forever. The burst limit only takes effect within a `StartLimitIntervalSec=` window, which must be long enough to contain all five attempts at a 2 s restart delay.
- `MemoryMax=` — enforces the on-airframe memory budget (~6 GB; Tier-1 YOLO container holds ~2 GB).
- `LimitNOFILE`, `LimitNPROC` set explicitly.
- `ExecStartPre=/bin/sh -c 'mkdir -p /run/azaion && touch /run/azaion/in-flight'` — asserts the suite-wide flight-gate so `model-sync.service` does not pull a new model mid-flight.
- `ExecStopPost=/bin/rm -f /run/azaion/in-flight` — clears the flight-gate on shutdown.

**Runtime config** (`/etc/azaion/autopilot/config.toml`) is the single source for non-secret configuration: RTSP URL, gimbal endpoint, MAVLink connection URI, missions API endpoint, Ground Station endpoint, VLM IPC socket path, `vlm_enabled` flag, log level. Secrets (if any — TBD per `../_docs/02_missions.md` auth model) come from the systemd `EnvironmentFile=` pointing at a permission-restricted file.
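
The `Type=notify` readiness handshake from the unit highlights above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function name `notify_ready` is hypothetical, only filesystem-path notification sockets are handled, and the real binary may well use a dedicated crate instead. What is standard is the protocol itself: systemd passes a datagram socket path in `$NOTIFY_SOCKET` and expects a `READY=1` message on it.

```rust
use std::ffi::OsStr;
use std::io;
use std::os::unix::net::UnixDatagram;

/// Sends `READY=1` to the systemd notification socket, if one was provided.
///
/// `notify_socket` is expected to hold the value of `$NOTIFY_SOCKET`.
/// Returns `Ok(false)` when no socket is given, so the same binary still
/// starts normally in a container or an interactive shell.
fn notify_ready(notify_socket: Option<&OsStr>) -> io::Result<bool> {
    let Some(path) = notify_socket else {
        return Ok(false);
    };
    // Assumption: filesystem-path sockets only. Abstract-namespace sockets
    // (values starting with '@') need extra handling that is not shown here.
    let socket = UnixDatagram::unbound()?;
    socket.send_to(b"READY=1", path)?;
    Ok(true)
}

fn main() -> io::Result<()> {
    // Hypothetical call site: in the real service this would run only after
    // the Tier 1, gimbal, and MAVLink links have all reported healthy.
    let socket = std::env::var_os("NOTIFY_SOCKET");
    let sent = notify_ready(socket.as_deref())?;
    println!("readiness notification sent: {sent}");
    Ok(())
}
```

Taking the socket path as a parameter rather than reading the environment inside the function keeps the handshake easy to exercise in tests with a temporary socket.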

## 4. Container image (Option B — dev / CI / emulation)

**Base image:** `nvcr.io/nvidia/l4t-base:<JetPack-pinned-tag>` for production-equivalent NVDEC + TensorRT plumbing; `ubuntu:22.04` for emulation (no GPU acceleration).

**Image layout:**

```text
/usr/local/bin/autopilot                  Rust binary (built outside the image)
/etc/azaion/autopilot/config.toml         runtime config (mounted at runtime)
/var/lib/autopilot/                       persistent state (volume-mounted)
```

**Image is non-root.** Default `USER` is `autopilot:autopilot`; `/var/lib/autopilot/` is owned by that user.

**Compose example** (development):

```yaml
services:
  autopilot:
    image: azaion/autopilot:dev-arm64
    restart: unless-stopped
    environment:
      AUTOPILOT_CONFIG: /etc/azaion/autopilot/config.toml
    volumes:
      - ./config/autopilot.toml:/etc/azaion/autopilot/config.toml:ro
      - autopilot-state:/var/lib/autopilot
      - /run/azaion:/run/azaion
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0 # MAVLink serial (if used)
    network_mode: host # RTSP / UDP gimbal / Ground Station modem all on host
volumes:
  autopilot-state: {}
```

`network_mode: host` is intentional on Jetson: RTSP, gimbal UDP, MAVLink UDP, and the modem-link to the Ground Station all share the airframe's network namespace.

## 5. External dependencies on the airframe

`autopilot` itself is the only autopilot-owned process. The on-airframe tier also runs (separately):

- **`../detections`** — Tier 1 YOLO service. Container delivered from its own pipeline. Bi-directional gRPC endpoint consumed by `detection_client`.
- **NanoLLM / VILA1.5-3B** (optional) — local IPC peer of `vlm_client`. Separate container or process; not embedded in the autopilot binary. Surfaces a Unix-domain socket; peer-credential check is mandatory when supported.
- **GPS-Denied service** — separate edge service, owned by `gps-denied-onboard`; consumed indirectly through the shared edge data path (per `../_docs/11_gps_denied.md`).
- **`model-sync.service`** — suite-wide rclone-driven model puller. Reads `/run/azaion/in-flight` to defer model swaps during flight (per `../_docs/00_top_level_architecture.md`).
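
The flight-gate contract in the last bullet is just "does the marker file exist". The sketch below shows the consumer side of that check. It is written in Rust only to stay consistent with the rest of this document; the real `model-sync.service` is rclone-driven and may implement the check differently. The names `ModelSwap` and `check_flight_gate` are hypothetical, and deferring when the check itself fails is an assumed fail-safe policy, not something the source specifies.

```rust
use std::io;
use std::path::Path;

/// What a consumer of the flight-gate should do with a pending model swap.
#[derive(Debug, PartialEq, Eq)]
enum ModelSwap {
    Proceed,
    Defer,
}

/// Checks the flight-gate marker: present means a flight is in progress.
///
/// `try_exists` is used instead of `exists` so that "marker absent" and
/// "could not check" (for example, a permission error) stay distinguishable.
fn check_flight_gate(marker: &Path) -> io::Result<ModelSwap> {
    if marker.try_exists()? {
        Ok(ModelSwap::Defer)
    } else {
        Ok(ModelSwap::Proceed)
    }
}

fn main() {
    let marker = Path::new("/run/azaion/in-flight");
    // Assumed policy: if the gate cannot be checked, behave as if in flight.
    let decision = check_flight_gate(marker).unwrap_or(ModelSwap::Defer);
    println!("model swap decision: {decision:?}");
}
```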

## 6. Configuration surface

All configuration is declarative (`config.toml`); there is no compile-time configuration of endpoints, addresses, or feature switches **except** the `vlm_client` build-time feature flag (see `architecture.md §7.6 Local VLM confirmation > Optionality model`).

| Concern | Mechanism |
|---|---|
| RTSP / gimbal / MAVLink endpoints | `config.toml` |
| `missions` API endpoint + auth | `config.toml` (auth pulled from `EnvironmentFile=`) |
| Ground Station endpoint | `config.toml` |
| VLM IPC socket path | `config.toml` |
| `vlm_enabled` runtime flag | `config.toml` |
| `vlm_client` build-time feature | `cargo build --features vlm` at build |
| Log level + format | `RUST_LOG` env (`tracing-subscriber` honours it) |
| Mission ID for the current flight | CLI arg (per-flight, not per-host) |
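
The table lists two independent VLM switches: a build-time Cargo feature and the `vlm_enabled` runtime flag. One plausible way they combine is sketched below. This is a guess at the interaction, not the optionality model defined in `architecture.md`; the names `VlmMode` and `vlm_mode` are hypothetical, and the runtime flag is hard-coded where the real binary would read it from `config.toml`.

```rust
/// Effective state of the VLM path after combining both switches.
#[derive(Debug, PartialEq, Eq)]
enum VlmMode {
    /// The binary was built without the `vlm` feature.
    NotCompiled,
    /// The feature is compiled in, but the runtime flag turns it off.
    Disabled,
    /// Both switches are on.
    Enabled,
}

/// Assumed precedence: the build-time feature is a hard gate, and the
/// runtime flag only matters when the feature is present.
fn vlm_mode(compiled_in: bool, vlm_enabled: bool) -> VlmMode {
    match (compiled_in, vlm_enabled) {
        (false, _) => VlmMode::NotCompiled,
        (true, false) => VlmMode::Disabled,
        (true, true) => VlmMode::Enabled,
    }
}

fn main() {
    // `cfg!` is resolved at compile time: true only when the crate is built
    // with the `vlm` feature enabled.
    let compiled_in = cfg!(feature = "vlm");
    // Placeholder for the `vlm_enabled` value that would come from config.toml.
    let vlm_enabled = true;
    println!("VLM mode: {:?}", vlm_mode(compiled_in, vlm_enabled));
}
```

Keeping the decision in a pure function means the feature gate is evaluated in exactly one place, and every combination can be unit-tested without rebuilding with different features.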

## 7. Health endpoint

`autopilot` exposes a single HTTP health endpoint (port and bind address from `config.toml`; default `127.0.0.1:8080`). It aggregates per-component readiness:

```json
{
  "status": "green | yellow | red",
  "components": {
    "frame_ingest": "green",
    "detection_client": "green",
    "movement_detector": "green",
    "semantic_analyzer": "green",
    "vlm_client": "disabled",
    "scan_controller": "green",
    "mapobjects_store": "green",
    "gimbal_controller": "green",
    "operator_bridge": "yellow",
    "mission_executor": "green",
    "mavlink_layer": "green",
    "mission_client": "green",
    "telemetry_stream": "green"
  },
  "last_state_change": "2026-05-17T12:00:00Z"
}
```

`yellow` means degraded but still running; `red` means at least one essential component has failed unrecoverably. The aggregator surfaces details on each transition through `tracing` (see `observability.md`).
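
The roll-up from per-component states to the top-level `status` can be sketched as a worst-of reduction. The rules used here are assumptions, since the source only defines what `yellow` and `red` mean: disabled components are ignored, a failed non-essential component degrades the overall status to yellow rather than red, and an empty component list reports green. The `Health` and `Component` types and the `essential` field are hypothetical.

```rust
/// Per-component state. Variant order matters: the derived `Ord` ranks
/// later variants as worse, so `max()` yields the worst state.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Health {
    Disabled,
    Green,
    Yellow,
    Red,
}

struct Component {
    name: &'static str,
    health: Health,
    /// Assumption: only essential components can turn the overall status red.
    essential: bool,
}

/// Worst-of aggregation over all components that are not disabled.
fn aggregate(components: &[Component]) -> Health {
    components
        .iter()
        .filter(|c| c.health != Health::Disabled)
        .map(|c| match (c.health, c.essential) {
            // A failed non-essential component only degrades the system.
            (Health::Red, false) => Health::Yellow,
            (health, _) => health,
        })
        .max()
        .unwrap_or(Health::Green)
}

fn main() {
    let components = [
        Component { name: "mavlink_layer", health: Health::Green, essential: true },
        Component { name: "vlm_client", health: Health::Disabled, essential: false },
        Component { name: "operator_bridge", health: Health::Yellow, essential: false },
    ];
    for c in &components {
        println!("{}: {:?}", c.name, c.health);
    }
    println!("status: {:?}", aggregate(&components));
}
```

With the sample above, which mirrors the JSON payload (one yellow component, one disabled), the aggregate is `Yellow`.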

## 8. Out of scope here

- Provisioning the Jetson host itself (Ansible / Kickstart / disk imaging) — owned by airframe ops.
- Build pipeline (cross-compile, signing, registry push) — see `ci_cd_pipeline.md`.
- Observability stack (tracing exporter, log shipper, metrics scraper) — see `observability.md`.
- Mission delivery to the airframe — owned by `missions` API.