Containerisation

Status: forward-looking design (Rust). Final shape will surface during build-system bring-up; treat the choices below as the current intent, not commitments.

1. Deployment shape

autopilot is a single Rust binary. Two delivery options are considered:

| Option | Form | Pros | Cons |
|---|---|---|---|
| A: native systemd unit | bare binary deployed to /usr/local/bin/autopilot + a .service unit | minimum overhead on Jetson; closest to airframe constraints; trivial flight-gate integration | per-host installation discipline; less reproducible across nodes |
| B: single container image | azaion/autopilot:<branch>-arm64 | consistent across environments; matches the suite's existing OTA model (Watchtower) | container runtime adds startup latency and one more failure surface on the airframe |

The current intent is Option A for the on-airframe deployment (lowest overhead, closest to the autopilot's real-time constraints) and Option B for development / CI / emulated-hardware testing (reproducibility wins). The same Rust binary is built once and packaged into both.

2. Target hardware

| Item | Value |
|---|---|
| Edge device | NVIDIA Jetson Orin Nano Super 8 GB |
| Architecture | aarch64 |
| OS | Ubuntu 22.04 (JetPack-bundled); locked JetPack version + power mode |
| Camera | ViewPro A40 (RTSP + UDP control) |
| Autopilot | ArduPilot or PX4 over MAVLink v2 (UDP or serial) |

3. Native deployment (Option A — production)

Layout:

/usr/local/bin/autopilot                  Rust binary
/etc/azaion/autopilot/config.toml         runtime config
/etc/systemd/system/autopilot.service     systemd unit
/var/lib/autopilot/                       persistent state (mapobjects_store)
/run/azaion/in-flight                     flight-gate marker (per ../_docs/00_top_level_architecture.md)

systemd unit highlights:

  • Type=notify — autopilot signals readiness once Tier 1, gimbal, and MAVLink links are healthy.
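  • Restart=on-failure, RestartSec=2s, StartLimitBurst=5: bounded restart, so a hard-broken binary doesn't loop forever. The bound only holds if StartLimitIntervalSec= is long enough for five restarts to fall inside one window; the systemd default of 10 s is borderline with RestartSec=2s, so set it explicitly.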
  • MemoryMax= — enforces the on-airframe memory budget (~6 GB; Tier-1 YOLO container holds ~2 GB).
  • LimitNOFILE, LimitNPROC set explicitly.
  • ExecStartPre=/bin/sh -c 'mkdir -p /run/azaion && touch /run/azaion/in-flight' — asserts the suite-wide flight-gate so model-sync.service does not pull a new model mid-flight.
  • ExecStopPost=/bin/rm -f /run/azaion/in-flight — clears the flight-gate on shutdown.
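
The Type=notify readiness step above can be sketched with the standard library alone. This is a minimal sketch, not the project's implementation: the function names and the three boolean health inputs are hypothetical, while NOTIFY_SOCKET and the READY=1 payload are the systemd notification protocol. Abstract-namespace sockets (a NOTIFY_SOCKET value starting with @) are not handled here.

```rust
use std::env;
use std::io;
use std::os::unix::net::UnixDatagram;
use std::path::Path;

/// Sends one sd_notify-style datagram (e.g. "READY=1") to a filesystem socket path.
fn notify(socket_path: &Path, state: &str) -> io::Result<()> {
    let sock = UnixDatagram::unbound()?;
    sock.send_to(state.as_bytes(), socket_path)?;
    Ok(())
}

/// Hypothetical readiness gate: report READY=1 only once Tier 1, gimbal,
/// and MAVLink links are all healthy. Returns true if a notification was sent.
fn signal_ready_if_healthy(tier1_ok: bool, gimbal_ok: bool, mavlink_ok: bool) -> io::Result<bool> {
    if !(tier1_ok && gimbal_ok && mavlink_ok) {
        return Ok(false);
    }
    // systemd sets NOTIFY_SOCKET for Type=notify units; it is absent when the
    // binary is started by hand, in which case there is nobody to notify.
    match env::var_os("NOTIFY_SOCKET") {
        Some(path) => {
            notify(Path::new(&path), "READY=1")?;
            Ok(true)
        }
        None => Ok(false),
    }
}
```

Until READY=1 arrives, systemd keeps the unit in the activating state, so units ordered after autopilot.service wait for the three links rather than for process start.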

Runtime config (/etc/azaion/autopilot/config.toml) is the single source for non-secret configuration: RTSP URL, gimbal endpoint, MAVLink connection URI, missions API endpoint, Ground Station endpoint, VLM IPC socket path, vlm_enabled flag, log level. Secrets (if any — TBD per ../_docs/02_missions.md auth model) come from the systemd EnvironmentFile= pointing at a permission-restricted file.
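
Because EnvironmentFile= places its KEY=VALUE pairs in the process environment, secrets can be read at startup separately from config.toml. The sketch below assumes a single variable named MISSIONS_API_TOKEN, which is hypothetical: the auth model is still TBD, so the variable may turn out to be absent, renamed, or replaced by several. The lookup is passed in as a closure so the logic can be exercised without touching the real environment.

```rust
use std::env;

/// Secrets supplied through the process environment (populated by EnvironmentFile=).
#[derive(Debug, PartialEq)]
struct Secrets {
    /// Hypothetical variable; None means "no auth configured".
    missions_api_token: Option<String>,
}

/// Builds Secrets from any key lookup; rejects a variable that is set but blank,
/// since that usually indicates a broken environment file rather than intent.
fn load_secrets<F>(lookup: F) -> Result<Secrets, String>
where
    F: Fn(&str) -> Option<String>,
{
    let token = match lookup("MISSIONS_API_TOKEN") {
        Some(v) if v.trim().is_empty() => {
            return Err("MISSIONS_API_TOKEN is set but empty".to_string());
        }
        other => other,
    };
    Ok(Secrets { missions_api_token: token })
}

/// Production entry point: reads from the real process environment.
fn load_secrets_from_env() -> Result<Secrets, String> {
    load_secrets(|key| env::var(key).ok())
}
```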

4. Container image (Option B — dev / CI / emulation)

Base image: nvcr.io/nvidia/l4t-base:<JetPack-pinned-tag> for production-equivalent NVDEC + TensorRT plumbing; ubuntu:22.04 for emulation (no GPU acceleration).

Image layout:

/usr/local/bin/autopilot                  Rust binary (built outside the image)
/etc/azaion/autopilot/config.toml         runtime config (mounted at runtime)
/var/lib/autopilot/                       persistent state (volume-mounted)

Image is non-root. Default USER is autopilot:autopilot; /var/lib/autopilot/ is owned by that user.

Compose example (development):

services:
  autopilot:
    image: azaion/autopilot:dev-arm64
    restart: unless-stopped
    environment:
      AUTOPILOT_CONFIG: /etc/azaion/autopilot/config.toml
    volumes:
      - ./config/autopilot.toml:/etc/azaion/autopilot/config.toml:ro
      - autopilot-state:/var/lib/autopilot
      - /run/azaion:/run/azaion
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0   # MAVLink serial (if used)
    network_mode: host              # RTSP / UDP gimbal / Ground Station modem all on host
volumes:
  autopilot-state: {}

network_mode: host is intentional on Jetson: RTSP, gimbal UDP, MAVLink UDP, and the modem-link to the Ground Station all share the airframe's network namespace.

5. External dependencies on the airframe

autopilot itself is the only autopilot-owned process. The on-airframe tier also runs (separately):

  • ../detections — Tier 1 YOLO service. Container delivered from its own pipeline. Bi-directional gRPC endpoint consumed by detection_client.
  • NanoLLM / VILA1.5-3B (optional) — local IPC peer of vlm_client. Separate container or process; not embedded in the autopilot binary. Surfaces a Unix-domain socket; peer-credential check is mandatory when supported.
  • GPS-Denied service — separate edge service, owned by gps-denied-onboard; consumed indirectly through the shared edge data path (per ../_docs/11_gps_denied.md).
  • model-sync.service — suite-wide rclone-driven model puller. Reads /run/azaion/in-flight to defer model swaps during flight (per ../_docs/00_top_level_architecture.md).
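
The flight-gate contract between autopilot.service and model-sync.service reduces to a file-existence check on the reader side. The sketch below shows that check in Rust for illustration only; model-sync is described as rclone-driven and may well be a shell script, and the type and function names are hypothetical.

```rust
use std::path::Path;

/// Marker path from the deployment layout; its presence means a flight is in progress.
const FLIGHT_GATE: &str = "/run/azaion/in-flight";

#[derive(Debug, PartialEq)]
enum SyncDecision {
    Proceed,
    Defer,
}

/// Defers the model swap while the marker exists, proceeds otherwise.
fn decide_model_sync(marker: &Path) -> SyncDecision {
    if marker.exists() {
        SyncDecision::Defer
    } else {
        SyncDecision::Proceed
    }
}

/// Convenience wrapper that checks the suite-wide marker path.
fn decide_now() -> SyncDecision {
    decide_model_sync(Path::new(FLIGHT_GATE))
}
```

Since the marker is created in ExecStartPre= and removed in ExecStopPost=, this check defers for as long as the autopilot unit is active.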

6. Configuration surface

All configuration is declarative (config.toml); there is no compile-time configuration of endpoints, addresses, or feature switches except the vlm_client build-time feature flag (see architecture.md §7.6 Local VLM confirmation > Optionality model).

| Concern | Mechanism |
|---|---|
| RTSP / gimbal / MAVLink endpoints | config.toml |
| missions API endpoint + auth | config.toml (auth pulled from EnvironmentFile=) |
| Ground Station endpoint | config.toml |
| VLM IPC socket path | config.toml |
| vlm_enabled runtime flag | config.toml |
| vlm_client build-time feature | cargo build --features vlm at build |
| Log level + format | RUST_LOG env (tracing-subscriber's EnvFilter honours it) |
| Mission ID for the current flight | CLI arg (per-flight, not per-host) |
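
The split between per-host and per-flight inputs can be sketched as a small startup resolver. The following is an illustration under stated assumptions: the flag name --mission-id is hypothetical (the source only says "CLI arg"), and falling back to /etc/azaion/autopilot/config.toml when AUTOPILOT_CONFIG is unset is assumed rather than specified.

```rust
/// Default config location from the native deployment layout.
const DEFAULT_CONFIG: &str = "/etc/azaion/autopilot/config.toml";

#[derive(Debug, PartialEq)]
struct Launch {
    config_path: String,
    mission_id: String,
}

/// `args` excludes the program name; `env_config` is the value of
/// AUTOPILOT_CONFIG if it is set.
fn resolve_launch(args: &[String], env_config: Option<String>) -> Result<Launch, String> {
    let mut mission_id = None;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            // Hypothetical flag name, chosen for illustration.
            "--mission-id" => {
                mission_id = Some(iter.next().ok_or("--mission-id needs a value")?.clone());
            }
            other => return Err(format!("unknown argument: {other}")),
        }
    }
    Ok(Launch {
        // Per-host: environment override, else the default path.
        config_path: env_config.unwrap_or_else(|| DEFAULT_CONFIG.to_string()),
        // Per-flight: must be given on every start, never read from config.toml.
        mission_id: mission_id.ok_or("missing --mission-id (required per flight)")?,
    })
}
```

A caller would pass `std::env::args().skip(1).collect::<Vec<_>>()` and `std::env::var("AUTOPILOT_CONFIG").ok()`.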

7. Health endpoint

autopilot exposes a single HTTP health endpoint (port and bind address from config.toml; default 127.0.0.1:8080). It aggregates per-component readiness:

{
  "status": "green | yellow | red",
  "components": {
    "frame_ingest":      "green",
    "detection_client":  "green",
    "movement_detector": "green",
    "semantic_analyzer": "green",
    "vlm_client":        "disabled",
    "scan_controller":   "green",
    "mapobjects_store":  "green",
    "gimbal_controller": "green",
    "operator_bridge":   "yellow",
    "mission_executor":  "green",
    "mavlink_layer":     "green",
    "mission_client":    "green",
    "telemetry_stream":  "green"
  },
  "last_state_change": "2026-05-17T12:00:00Z"
}

yellow is degraded-but-running; red is unrecoverable for at least one essential component. The aggregator surfaces details on each transition through tracing (see observability.md).
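
One way to derive the top-level status from the per-component values is a worst-of reduction. The sketch below makes two assumptions the source does not state: every enabled component is treated as essential, and disabled components are ignored. The type and function names are hypothetical.

```rust
/// Variant order matters: the derived Ord ranks Red worst, Disabled lowest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Status {
    Disabled,
    Green,
    Yellow,
    Red,
}

/// Overall status is the worst status among enabled components.
/// With nothing enabled there is nothing degraded, so the result is Green.
fn aggregate(components: &[(&str, Status)]) -> Status {
    components
        .iter()
        .map(|(_, status)| *status)
        .filter(|status| *status != Status::Disabled)
        .max()
        .unwrap_or(Status::Green)
}
```

Applied to the example payload above, where operator_bridge is yellow and vlm_client is disabled, this reduction yields yellow.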

8. Out of scope here

  • Provisioning the Jetson host itself (Ansible / Kickstart / disk imaging) — owned by airframe ops.
  • Build pipeline (cross-compile, signing, registry push) — see ci_cd_pipeline.md.
  • Observability stack (tracing exporter, log shipper, metrics scraper) — see observability.md.
  • Mission delivery to the airframe — owned by missions API.