# Azaion.Loader — Architecture

## 1. System Context

**Problem being solved**: Azaion's suite of AI/drone services ships as encrypted Docker images. Edge devices need a secure way to authenticate, download encryption keys, decrypt the image archive, and load it into Docker. They also need an ongoing mechanism to download and upload encrypted model resources, which are split into a small part and a big part for security and CDN offloading.

**System boundaries**:
- **Inside**: FastAPI service handling auth, resource management, and Docker image unlock
- **Outside**: Azaion Resource API, S3-compatible CDN, Docker daemon, external HTTP clients

**External systems**:

| System              | Integration Type | Direction | Purpose                                                          |
|---------------------|------------------|-----------|------------------------------------------------------------------|
| Azaion Resource API | REST (HTTPS)     | Both      | Authentication, resource download/upload, key fragment retrieval |
| S3-compatible CDN   | S3 API (boto3)   | Both      | Large resource part storage                                      |
| Docker daemon       | CLI (subprocess) | Outbound  | Load decrypted image archives, inspect images                    |
| Host OS             | CLI (subprocess) | Inbound   | Hardware fingerprint collection                                  |

## 2. Technology Stack

| Layer         | Technology                 | Version      | Rationale                                               |
|---------------|----------------------------|--------------|---------------------------------------------------------|
| Language      | Python + Cython            | 3.11 / 3.1.3 | Cython for IP protection (compiled .so) + performance   |
| Framework     | FastAPI + Uvicorn          | latest       | Async HTTP, auto-generated OpenAPI docs                 |
| Database      | None                       | n/a          | Stateless service; all persistence is external          |
| Cache         | In-memory (module globals) | n/a          | JWT token, hardware fingerprint, CDN config             |
| Message Queue | None                       | n/a          | Synchronous request-response only                       |
| Container     | Docker (python:3.11-slim)  | n/a          | Docker CLI installed inside container for `docker load` |
| CI/CD         | Woodpecker CI              | n/a          | ARM64 Docker builds pushed to local registry            |

**Key constraints**:
- Must run on ARM64 edge devices
- Requires access to the host Docker daemon for image loading, via a mounted Docker socket (referred to here as Docker-in-Docker, although no nested daemon runs in the container)
- Cython compilation at build time: `.pyx` files are compiled to native extensions for IP protection

## 3. Deployment Model

**Environments**: Development (local), Production (edge devices)

**Infrastructure**:
- Containerized via Docker (single container)
- Runs on edge devices with Docker socket access
- No orchestration layer (standalone container)

**Environment-specific configuration**:

| Config           | Development              | Production                      |
|------------------|--------------------------|---------------------------------|
| RESOURCE_API_URL | `https://api.azaion.com` | `https://api.azaion.com` (same) |
| IMAGES_PATH      | `/opt/azaion/images.enc` | `/opt/azaion/images.enc`        |
| Secrets          | Env vars / cdn.yaml      | Env vars / cdn.yaml (encrypted) |
| Logging          | stdout + stderr          | File (Logs/) + stdout + stderr  |
| Docker socket    | Mounted from host        | Mounted from host               |

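As a minimal sketch of how the two path/URL settings in the table could be resolved, the snippet below reads them from the environment and falls back to the documented defaults. It assumes `RESOURCE_API_URL` and `IMAGES_PATH` are plain environment variables; the `LoaderSettings` class and `load_settings` function are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LoaderSettings:
    """Hypothetical container for the two settings listed in the table."""
    resource_api_url: str
    images_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> LoaderSettings:
    # Assumption: both values come from environment variables and default
    # to the values documented for Development and Production.
    source = os.environ if env is None else env
    return LoaderSettings(
        resource_api_url=source.get("RESOURCE_API_URL", "https://api.azaion.com"),
        images_path=source.get("IMAGES_PATH", "/opt/azaion/images.enc"),
    )
```

Passing an explicit mapping instead of relying on `os.environ` keeps the function easy to exercise in tests without touching the process environment.
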
## 4. Data Model Overview

**Core entities**:

| Entity         | Description                        | Owned By Component |
|----------------|------------------------------------|--------------------|
| Credentials    | Email + password pair              | 01 Core Models     |
| User           | Authenticated user with role       | 01 Core Models     |
| RoleEnum       | Authorization role hierarchy       | 01 Core Models     |
| UnlockState    | State machine for unlock workflow  | 01 Core Models     |
| CDNCredentials | S3 endpoint + read/write key pairs | 03 Resource Mgmt   |

**Key relationships**:
- Credentials → User: login produces a User from JWT claims
- Credentials → CDNCredentials: credentials enable downloading the encrypted cdn.yaml config

**Data flow summary**:
- Client → Loader → Resource API: authentication, encrypted resource download (small part)
- Client → Loader → CDN: large resource part upload/download
- Client → Loader → Docker: decrypted image archive loading

## 5. Integration Points

### Internal Communication

| From          | To                 | Protocol    | Pattern          |
|---------------|--------------------|-------------|------------------|
| HTTP API (04) | Resource Mgmt (03) | Direct call | Request-Response |
| Resource Mgmt | Security (02)      | Direct call | Request-Response |
| Resource Mgmt | Core Models (01)   | Direct call | Read constants   |

### External Integrations

| External System     | Protocol     | Auth            | Rate Limits | Failure Mode                            |
|---------------------|--------------|-----------------|-------------|-----------------------------------------|
| Azaion Resource API | REST/HTTPS   | JWT Bearer      | Unknown     | Retry once on 401/403; raise on 500/409 |
| S3-compatible CDN   | S3 API/HTTPS | Access key pair | Unknown     | Return False, log error                 |
| Docker daemon       | CLI/socket   | Docker socket   | n/a         | Raise CalledProcessError                |

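The Resource API failure mode in the table (retry once on 401/403, raise on other errors) can be sketched as follows. This is an illustration under stated assumptions, not the Loader's actual client: `call_with_reauth`, `ResourceApiError`, and the `send`/`login` callables are hypothetical, and treating every remaining status of 400 or above as an error is an assumption that goes slightly beyond the two codes the table names.

```python
from typing import Callable, Tuple

# (HTTP status code, response body); a stand-in for a real HTTP response.
Response = Tuple[int, bytes]


class ResourceApiError(RuntimeError):
    """Hypothetical error raised when the Resource API call cannot succeed."""


def call_with_reauth(send: Callable[[], Response],
                     login: Callable[[], None]) -> bytes:
    status, body = send()
    if status in (401, 403):
        # Token missing, expired, or rejected: log in again, then retry once.
        login()
        status, body = send()
    if status >= 400:
        # Covers 500/409 from the table, plus a second 401/403 after the retry.
        raise ResourceApiError(f"Resource API returned HTTP {status}")
    return body
```

Keeping the retry to a single attempt means a persistently rejected login surfaces as an error instead of looping.
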
## 6. Non-Functional Requirements

| Requirement    | Target                  | Measurement            | Priority |
|----------------|-------------------------|------------------------|----------|
| Availability   | Service uptime          | `/health` endpoint     | High     |
| Latency (p95)  | Varies by resource size | Per-request timing     | Medium   |
| Data retention | 30 days (logs)          | Loguru rotation config | Low      |

No explicit SLAs, throughput targets, or recovery objectives are defined in the codebase.

## 7. Security Architecture

**Authentication**: JWT Bearer tokens issued by the Azaion Resource API. Tokens are decoded without signature verification; the Loader trusts the API server.

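Decoding a JWT without verifying its signature amounts to base64url-decoding the middle segment of the token. The sketch below shows that step using only the standard library; `decode_jwt_claims` is a hypothetical helper, and the Loader may well use a JWT library for this instead.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url-encoded with the '=' padding stripped;
    # restore the padding before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Because the signature is ignored, the claims are only as trustworthy as the channel the token arrived on, which is why the HTTPS connection to the API carries the trust here.
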
**Authorization**: Role-based (RoleEnum: NONE → Operator → Validator → CompanionPC → Admin → ResourceUploader → ApiAdmin). Roles are parsed from the JWT but not enforced by Loader endpoints.

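For illustration, an ordered role hierarchy like the one above can be modelled as an `IntEnum`, which makes "at least this role" checks a plain comparison. The numeric values below are assumptions chosen only to preserve the documented order, and `has_at_least` is a hypothetical helper: the Loader itself does not enforce roles.

```python
from enum import IntEnum


class RoleEnum(IntEnum):
    # Numeric values are assumed; only the ordering comes from the document.
    NONE = 0
    Operator = 1
    Validator = 2
    CompanionPC = 3
    Admin = 4
    ResourceUploader = 5
    ApiAdmin = 6


def has_at_least(role: RoleEnum, required: RoleEnum) -> bool:
    """Hypothetical check: True if `role` ranks at or above `required`."""
    return role >= required
```

A role name taken from a token claim could be mapped with `RoleEnum["Operator"]`, which raises `KeyError` for unknown names.
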
**Data protection**:
- At rest: AES-256-CBC encrypted resources on disk; Docker images stored as encrypted `.enc` archive
- In transit: HTTPS for API calls; S3 HTTPS for CDN
- Secrets management: CDN credentials stored in encrypted `cdn.yaml` downloaded from API; user credentials in memory only

**Key derivation**:
- Per-user/per-machine keys: `SHA-384(email + password + hardware_hash + salt)` → used for API resource downloads
- Shared resource key: `SHA-384(fixed_salt)` → used for big/small resource split encryption
- Hardware binding: `SHA-384("Azaion_" + hardware_fingerprint + salt)` → ties decryption to specific hardware

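The derivations listed above can be sketched with `hashlib`. Several details are assumptions rather than facts from the codebase: inputs are treated as UTF-8 strings concatenated in the order shown, the hardware hash is passed around as a hex string, and the salt values are placeholders. The function names are hypothetical. How the 48-byte SHA-384 digest is turned into a 32-byte AES-256 key is not specified in this document, so the sketch stops at the digest.

```python
import hashlib


def derive_hardware_hash(hardware_fingerprint: str, salt: str) -> str:
    # SHA-384("Azaion_" + hardware_fingerprint + salt), returned as hex
    # (the hex representation is an assumption).
    material = ("Azaion_" + hardware_fingerprint + salt).encode("utf-8")
    return hashlib.sha384(material).hexdigest()


def derive_user_key(email: str, password: str,
                    hardware_hash: str, salt: str) -> bytes:
    # SHA-384(email + password + hardware_hash + salt), as raw digest bytes.
    material = (email + password + hardware_hash + salt).encode("utf-8")
    return hashlib.sha384(material).digest()
```

Because the hardware hash feeds into the per-user key, the same credentials produce a different key on a different machine, which is the hardware binding the list describes.
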
**Audit logging**: Application-level logging via Loguru (file + stdout/stderr). No structured audit trail.

## 8. Key Architectural Decisions

### ADR-001: Cython for IP Protection

**Context**: The loader handles encryption keys and security-sensitive logic that should not be trivially readable.

**Decision**: Core modules (api_client, security, cdn_manager, hardware_service, credentials, user, constants) are written in Cython and compiled to native `.so` extensions.

**Alternatives considered**:
1. Pure Python with obfuscation: rejected because obfuscation is reversible
2. Compiled language (Rust/Go): rejected because the loader needs tight integration with the Python ecosystem (FastAPI, boto3)

**Consequences**: A build step is required (`setup.py build_ext --inplace`); `cdef` methods are not callable from pure Python; debugging compiled extensions is harder.

### ADR-002: Binary-Split Resource Scheme

**Context**: Large model files need secure distribution. Storing entire encrypted files on one server creates a single point of compromise.

**Decision**: Resources are encrypted, then split into a small part (uploaded to the authenticated API) and a large part (uploaded to CDN). Decryption requires both parts.

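The split itself can be sketched as slicing the already-encrypted bytes into two pieces and concatenating them again before decryption. Which end of the ciphertext becomes the small part, and how its size is chosen, are not stated here; the sketch assumes the small part is a prefix of `small_size` bytes. `split_resource` and `join_resource` are hypothetical names.

```python
from typing import Tuple


def split_resource(encrypted: bytes, small_size: int) -> Tuple[bytes, bytes]:
    """Split ciphertext into (small_part, big_part).

    Assumption: the small part is the first `small_size` bytes.
    """
    if not 0 < small_size < len(encrypted):
        raise ValueError("small_size must leave both parts non-empty")
    return encrypted[:small_size], encrypted[small_size:]


def join_resource(small_part: bytes, big_part: bytes) -> bytes:
    # Both parts are needed to rebuild the ciphertext prior to decryption.
    return small_part + big_part
```

With this layout the small part would go to the authenticated API and the big part to the CDN, so neither store alone holds the complete ciphertext.
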
**Alternatives considered**:
1. Single encrypted download from API: rejected because of bandwidth/cost for large files
2. Unencrypted CDN with signed URLs: rejected because CDN compromise would expose models

**Consequences**: More complex download/upload logic; local caching of big parts for performance; CDN credentials managed separately from API credentials.

### ADR-003: Docker-in-Docker for Image Loading

**Context**: The loader needs to inject Docker images into the host Docker daemon on edge devices.

**Decision**: Mount Docker socket into the loader container; use Docker CLI (`docker load`, `docker image inspect`) via subprocess.

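A minimal sketch of driving the two CLI commands named in the decision through `subprocess` is shown below. The function names and the injectable `runner` parameter are hypothetical and exist only to keep the example testable without a Docker daemon; `check=True` is what makes a failed `docker load` raise `CalledProcessError`.

```python
import subprocess
from typing import Callable, List


def docker_load_command(archive_path: str) -> List[str]:
    # `docker load --input <file>` reads an image archive from a file.
    return ["docker", "load", "--input", archive_path]


def load_image_archive(archive_path: str,
                       runner: Callable = subprocess.run) -> None:
    # check=True raises subprocess.CalledProcessError on a non-zero exit code.
    runner(docker_load_command(archive_path), check=True)


def image_present(image: str, runner: Callable = subprocess.run) -> bool:
    # `docker image inspect` exits non-zero when the image does not exist.
    result = runner(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Passing the command as a list rather than a shell string avoids shell interpretation of the archive path or image name.
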
**Alternatives considered**:
1. Docker API via Python library: rejected because Docker CLI is simpler and universally available
2. Image loading outside the loader: rejected because the unlock workflow needs to be self-contained

**Consequences**: Container requires Docker socket mount (security implication); Docker CLI must be installed in the container image.