mirror of
https://github.com/azaion/loader.git
synced 2026-04-22 08:06:34 +00:00
Quality cleanup refactoring
Made-with: Cursor
@@ -83,6 +83,7 @@ No tests exist. Coverage is 0% across all categories.
| 3 | Should endpoint-level authorization be enforced? | Security — currently all endpoints accessible post-login | Team |
| 4 | Should the resource encryption key be per-user instead of shared? | Security — currently all users share one key for big/small split | Team |
| 5 | What are the target latency/throughput requirements? | Performance — no SLAs defined | Product |
| 6 | Investigate replacing binary-split security with TPM on Jetson Orin Nano | Architecture — the binary-split model was designed for untrusted end-user laptops; SaaS/edge deployment on Jetson Orin Nano can use TPM instead, potentially simplifying the loader significantly | Team |

## Artifact Index
@@ -17,7 +17,6 @@ Central API client that orchestrates authentication, encrypted resource download
| token | str | JWT bearer token |
| cdn_manager | CDNManager | CDN upload/download client |
| api_url | str | Base URL for the resource API |
| folder | str | Declared in `.pxd` but never assigned — dead attribute |

#### Methods
@@ -28,17 +27,12 @@ Central API client that orchestrates authentication, encrypted resource download
| `set_credentials` | cdef | `(self, Credentials credentials)` | Internal: set credentials, lazy-init CDN manager |
| `login` | cdef | `(self)` | POST `/login`, store JWT token |
| `set_token` | cdef | `(self, str token)` | Decode JWT claims → create `User` with role mapping |
| `get_user` | cdef | `(self) -> User` | Lazy login + return user |
| `request` | cdef | `(self, str method, str url, object payload, bint is_stream)` | Authenticated HTTP request with auto-retry on 401/403 |
| `list_files` | cdef | `(self, str folder, str search_file)` | GET `/resources/list/{folder}` with search param |
| `check_resource` | cdef | `(self)` | POST `/resources/check` with hardware fingerprint |
| `load_bytes` | cdef | `(self, str filename, str folder) -> bytes` | Download + decrypt resource using per-user+hw key |
| `upload_file` | cdef | `(self, str filename, bytes resource, str folder)` | POST multipart upload to `/resources/{folder}` |
| `upload_file` | cdef | `(self, str filename, bytes resource, str folder)` | POST multipart upload to `/resources/{folder}`; raises on HTTP error |
| `load_big_file_cdn` | cdef | `(self, str folder, str big_part) -> bytes` | Download large file part from CDN |
| `load_big_small_resource` | cpdef | `(self, str resource_name, str folder) -> bytes` | Reassemble resource from small (API) + big (CDN/local) parts |
| `upload_big_small_resource` | cpdef | `(self, bytes resource, str resource_name, str folder)` | Split-encrypt and upload small part to API, big part to CDN |
| `upload_to_cdn` | cpdef | `(self, str bucket, str filename, bytes file_bytes)` | Direct CDN upload |
| `download_from_cdn` | cpdef | `(self, str bucket, str filename) -> bytes` | Direct CDN download |
| `upload_big_small_resource` | cpdef | `(self, bytes resource, str resource_name, str folder)` | Split-encrypt; CDN upload must succeed or raises; then small part via `upload_file` |

## Internal Logic
@@ -58,8 +52,8 @@ Central API client that orchestrates authentication, encrypted resource download
### Big/Small Resource Split (upload)
1. Encrypts entire resource with shared resource key
2. Splits: small part = `min(SMALL_SIZE_KB * 1024, 30% of encrypted)`, big part = remainder
3. Uploads big part to CDN + saves local copy
4. Uploads small part to API via multipart POST
3. Calls `cdn_manager.upload` for the big part; raises if upload fails
4. Writes big part to local cache, then uploads small part to API via `upload_file` (non-2xx responses propagate)
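
The split rule in step 2 can be sketched as follows. This is a minimal illustration, not the loader's code: it assumes the small part is the leading slice of the encrypted payload and that the 30% bound is rounded down. `split_encrypted` is a hypothetical helper name.

```python
SMALL_SIZE_KB = 3  # value listed in the constants table


def split_encrypted(encrypted: bytes, small_size_kb: int = SMALL_SIZE_KB) -> tuple[bytes, bytes]:
    """Split an encrypted blob into (small_part, big_part).

    Assumption: the small part is taken from the start of the blob.
    """
    # The small part is capped by the fixed size and by 30% of the payload
    # (integer arithmetic, rounded down).
    small_len = min(small_size_kb * 1024, len(encrypted) * 3 // 10)
    return encrypted[:small_len], encrypted[small_len:]
```

For payloads larger than about 10 KB the fixed cap applies, so the small part is 3072 bytes; for smaller payloads the 30% bound takes over.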

### JWT Role Mapping
Maps `role` claim string to `RoleEnum`: ApiAdmin, Admin, ResourceUploader, Validator, Operator, or NONE (default).
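
A rough sketch of this mapping is shown below. The enum member values, the helper names `decode_claims` and `role_from_claim`, and the choice to read the payload without verifying the signature are all assumptions made for illustration; the actual `set_token` implementation is not shown in this diff.

```python
import base64
import json
from enum import Enum, auto


class RoleEnum(Enum):
    # Member names come from the doc; the numeric values are placeholders.
    NONE = auto()
    Operator = auto()
    Validator = auto()
    ResourceUploader = auto()
    Admin = auto()
    ApiAdmin = auto()


def decode_claims(token: str) -> dict:
    """Read the JWT payload WITHOUT verifying the signature (illustration only)."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def role_from_claim(role) -> RoleEnum:
    """Map the `role` claim to RoleEnum, falling back to NONE for unknown values."""
    return RoleEnum.__members__.get(role or "", RoleEnum.NONE)
```

Looking the claim up by member name keeps the NONE default in one place instead of in a chain of comparisons.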
@@ -89,7 +83,7 @@ The CDN config file is itself downloaded encrypted from the API on first credent
## External Integrations

- **Azaion Resource API**: `/login`, `/resources/get/{folder}`, `/resources/{folder}` (upload), `/resources/list/{folder}`, `/resources/check`
- **Azaion Resource API**: `/login`, `/resources/get/{folder}`, `/resources/{folder}` (upload)
- **S3 CDN**: via `CDNManager` for large file parts

## Security
@@ -11,7 +11,7 @@ Handles the encrypted Docker image archive workflow: downloading a key fragment
| Function | Signature | Description |
|------------------------|------------------------------------------------------------------------|----------------------------------------------------------|
| `download_key_fragment`| `(resource_api_url: str, token: str) -> bytes` | GET request to `/binary-split/key-fragment` with Bearer auth |
| `decrypt_archive` | `(encrypted_path: str, key_fragment: bytes, output_path: str) -> None` | AES-256-CBC decryption with SHA-256 derived key; strips PKCS7 padding |
| `decrypt_archive` | `(encrypted_path: str, key_fragment: bytes, output_path: str) -> None` | AES-256-CBC stream decrypt with SHA-256 derived key; PKCS7 removed in-pipeline via unpadder |
| `docker_load` | `(tar_path: str) -> None` | Runs `docker load -i <tar_path>` subprocess |
| `check_images_loaded` | `(version: str) -> bool` | Checks all `API_SERVICES` images exist for given version tag |
@@ -26,9 +26,8 @@ Handles the encrypted Docker image archive workflow: downloading a key fragment
### `decrypt_archive`
1. Derives AES key: `SHA-256(key_fragment)` → 32-byte key
2. Reads first 16 bytes as IV from encrypted file
3. Decrypts remaining data in 64KB chunks using AES-256-CBC
4. After decryption, reads last byte of output to determine PKCS7 padding length
5. Truncates output file to remove padding
3. Streams ciphertext in 64KB chunks through AES-256-CBC decryptor
4. Feeds decrypted chunks through `padding.PKCS7(128).unpadder()`; writes unpadded bytes to the output file (`finalize` on decryptor and unpadder at end)
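
The streaming variant (the steps that mention the unpadder) might look roughly like this with the `cryptography` package. This is a sketch under the assumptions stated in the comments, not the module's actual source; `CHUNK_SIZE` is a name introduced here.

```python
import hashlib

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

CHUNK_SIZE = 64 * 1024  # 64KB, as described in the steps above


def decrypt_archive(encrypted_path: str, key_fragment: bytes, output_path: str) -> None:
    key = hashlib.sha256(key_fragment).digest()  # 32-byte AES-256 key
    with open(encrypted_path, "rb") as src, open(output_path, "wb") as dst:
        iv = src.read(16)  # the first 16 bytes of the file are the IV
        decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
        unpadder = padding.PKCS7(128).unpadder()
        while chunk := src.read(CHUNK_SIZE):
            # The unpadder holds back the final block until finalize().
            dst.write(unpadder.update(decryptor.update(chunk)))
        # Flush both stages; unpadder.finalize() raises ValueError on bad padding.
        dst.write(unpadder.update(decryptor.finalize()))
        dst.write(unpadder.finalize())
```

Because padding is removed inside the pipeline, the output file never contains the padding bytes, so no read-back and truncate pass is needed.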

### `check_images_loaded`
Iterates all 7 service image names, runs `docker image inspect <name>:<version>` for each. Returns `False` on first missing image.
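
One possible shape for this check is sketched below. The service names are placeholders (the real `API_SERVICES` list is not shown in this diff), and the injectable `exists` parameter is an assumption added so the loop can be exercised without a Docker daemon.

```python
import subprocess

# Placeholder names; the real API_SERVICES list (7 entries) is not shown here.
API_SERVICES = ["service-a", "service-b"]


def _image_exists(ref: str) -> bool:
    """Return True if `docker image inspect <ref>` exits with status 0."""
    result = subprocess.run(
        ["docker", "image", "inspect", ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_images_loaded(version: str, services=API_SERVICES, exists=_image_exists) -> bool:
    for name in services:
        if not exists(f"{name}:{version}"):
            return False  # stop at the first missing image
    return True
```

Note that `_image_exists` raises `FileNotFoundError` if the `docker` binary is not on `PATH`; whether the real function handles that case is not stated here.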
@@ -36,7 +35,7 @@ Iterates all 7 service image names, runs `docker image inspect <name>:<version>`
## Dependencies

- **Internal**: none (leaf module)
- **External**: `hashlib`, `os`, `subprocess` (stdlib), `requests` (2.32.4), `cryptography` (44.0.2)
- **External**: `hashlib`, `subprocess` (stdlib), `requests` (2.32.4), `cryptography` (44.0.2)

## Consumers
@@ -8,15 +8,10 @@ Centralizes shared configuration constants and provides the application-wide log
### Constants (cdef, module-level)

| Name | Type | Value |
|------------------------|------|--------------------------------|
| CONFIG_FILE | str | `"config.yaml"` |
| QUEUE_CONFIG_FILENAME | str | `"secured-config.json"` |
| AI_ONNX_MODEL_FILE | str | `"azaion.onnx"` |
| CDN_CONFIG | str | `"cdn.yaml"` |
| MODELS_FOLDER | str | `"models"` |
| SMALL_SIZE_KB | int | `3` |
| ALIGNMENT_WIDTH | int | `32` |

| Name | Type | Value |
|---------------|------|--------------|
| CDN_CONFIG | str | `"cdn.yaml"` |
| SMALL_SIZE_KB | int | `3` |

Note: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, `ANNOTATIONS_QUEUE` are declared in the `.pxd` but not defined in the `.pyx` — they are unused in this codebase.
@@ -30,7 +25,7 @@ Note: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, `ANNOTATIONS_QUEUE` are declared in the
## Internal Logic

Loguru is configured with three sinks:
- **File sink**: `Logs/log_loader_{date}.txt`, INFO level, daily rotation, 30-day retention, async (enqueue=True)
- **File sink**: under `LOG_DIR`, path template `log_loader_{time:YYYYMMDD}.txt`, INFO level, daily rotation, 30-day retention, async (enqueue=True)
- **Stdout sink**: DEBUG level, filtered to INFO/DEBUG/SUCCESS only, colorized
- **Stderr sink**: WARNING+ level, colorized
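
A configuration along these lines could produce the three sinks, using the `LOG_DIR` variant of the file sink. Treat it as a sketch: the exact format string, the `"1 day"` rotation spelling, and the helper names `stdout_filter` and `configure_logging` are assumptions rather than the module's actual code.

```python
import os
import sys

LOG_DIR = os.environ.get("LOG_DIR", "Logs")  # assumed default directory
LOG_FORMAT = "[{time:HH:mm:ss} {level}] {message}"  # assumed spelling of the log format


def stdout_filter(record) -> bool:
    """Keep only the levels that are routed to stdout."""
    return record["level"].name in ("INFO", "DEBUG", "SUCCESS")


def configure_logging() -> None:
    from loguru import logger  # imported lazily so this sketch loads without loguru

    logger.remove()  # drop loguru's default stderr sink
    logger.add(
        os.path.join(LOG_DIR, "log_loader_{time:YYYYMMDD}.txt"),
        level="INFO",
        format=LOG_FORMAT,
        rotation="1 day",
        retention="30 days",
        enqueue=True,  # write through a queue instead of blocking the caller
    )
    logger.add(sys.stdout, level="DEBUG", format=LOG_FORMAT,
               filter=stdout_filter, colorize=True)
    logger.add(sys.stderr, level="WARNING", format=LOG_FORMAT, colorize=True)
```

The stdout filter is what keeps WARNING and above from being printed twice, once per stream.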
@@ -39,7 +34,7 @@ Log format: `[HH:mm:ss LEVEL] message`
## Dependencies

- **Internal**: none (leaf module)
- **External**: `loguru` (0.7.3), `sys`, `time`
- **External**: `loguru` (0.7.3), `os`, `sys`

## Consumers
@@ -53,7 +48,11 @@ None.
## Configuration

No env vars consumed directly. Log file path is hardcoded to `Logs/log_loader_{date}.txt`.

| Env Variable | Default | Description |
|--------------|---------|--------------------------------------|
| LOG_DIR | `Logs` | Directory for daily log files |

The file sink uses Loguru’s `{time:YYYYMMDD}` in the filename under `LOG_DIR`.

## External Integrations
@@ -33,29 +33,36 @@ FastAPI application entry point providing HTTP endpoints for health checks, auth
### Module-level State

| Name | Type | Description |
|----------------|--------------------|------------------------------------------------|
| api_client | ApiClient or None | Lazy-initialized singleton |
| unlock_state | UnlockState | Current unlock workflow state |
| unlock_error | Optional[str] | Last unlock error message |
| unlock_lock | threading.Lock | Thread safety for unlock state mutations |

| Name | Type | Description |
|-------------------|-------------------------|----------------------------------------------------------------|
| `_api_client` | `ApiClient` or `None` | Lazy-initialized singleton |
| `_api_client_lock`| `threading.Lock` | Protects lazy initialization of `_api_client` (double-checked) |
| `_unlock` | `_UnlockStateHolder` | Holds unlock workflow state and last error under an inner lock |

#### `_UnlockStateHolder`

| Member | Description |
|-----------|-----------------------------------------------------------------------------|
| `get()` | Returns `(state: UnlockState, error: Optional[str])` under lock |
| `set(state, error=None)` | Sets state and optional error message under lock |
| `state` (property) | Current `UnlockState` (read under lock) |

## Internal Logic

### `get_api_client()`
Lazy singleton pattern: creates `ApiClient(RESOURCE_API_URL)` on first call.
Double-checked locking: if `_api_client` is `None`, acquires `_api_client_lock`, re-checks, then imports `ApiClient` and constructs `ApiClient(RESOURCE_API_URL)` once.
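
The double-checked pattern described here can be sketched as follows. The `ApiClient` class and the URL are stand-ins so the example runs on its own; in the real module the client is a Cython class imported lazily inside the locked section.

```python
import threading

RESOURCE_API_URL = "https://api.example.invalid"  # placeholder value


class ApiClient:
    """Stand-in for the real (Cython) client, used only for this sketch."""

    def __init__(self, api_url: str) -> None:
        self.api_url = api_url


_api_client = None
_api_client_lock = threading.Lock()


def get_api_client() -> ApiClient:
    global _api_client
    if _api_client is None:  # fast path: no lock once initialized
        with _api_client_lock:
            if _api_client is None:  # re-check: another thread may have won the race
                _api_client = ApiClient(RESOURCE_API_URL)
    return _api_client
```

The second check is what prevents two threads that both saw `None` from constructing two clients.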

### Unlock Workflow (`_run_unlock`)
Background task (via FastAPI BackgroundTasks) that runs these steps:
1. Check if Docker images already loaded → if yes, set `ready`
1. Check if Docker images already loaded → if yes, set `ready` (preserving any prior error from `get()`)
2. Authenticate with API (login)
3. Download key fragment from `/binary-split/key-fragment`
4. Decrypt archive at `IMAGES_PATH` → `.tar`
5. `docker load` the tar file
6. Clean up tar file
7. Set state to `ready` (or `error` on failure)
6. Remove tar file; on `OSError`, log a warning and continue
7. Set state to `ready` with no error (or `error` on failure)

State transitions are guarded by `unlock_lock` (threading.Lock).
State and error are updated only through `_unlock.set()` and read via `_unlock.get()` / `_unlock.state`.
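
The control flow of the steps that go through the holder could be sketched like this. Every callable parameter is a hypothetical stand-in for the real login, download, decrypt, and `docker load` functions; the string states stand in for `UnlockState` members; and stdlib `logging` is used in place of loguru to keep the example dependency-free.

```python
import logging
import os

log = logging.getLogger("loader.unlock")  # stdlib logging stands in for loguru


def run_unlock(holder, images_loaded, login, fetch_key, decrypt, docker_load,
               images_path, tar_path):
    """Run the unlock steps in order; `holder` exposes get() and set(state, error=None)."""
    try:
        if images_loaded():                           # step 1
            _, prior_error = holder.get()
            holder.set("ready", prior_error)          # keep any earlier error
            return
        token = login()                               # step 2
        key_fragment = fetch_key(token)               # step 3
        decrypt(images_path, key_fragment, tar_path)  # step 4
        docker_load(tar_path)                         # step 5
        try:
            os.remove(tar_path)                       # step 6
        except OSError as exc:
            log.warning("could not remove %s: %s", tar_path, exc)
        holder.set("ready")                           # step 7: error cleared
    except Exception as exc:
        holder.set("error", str(exc))
```

A failed cleanup is deliberately not treated as an unlock failure: the images are already loaded at that point, so the state still ends at `ready`.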

### `/unlock` Endpoint
- If already `ready` → return immediately
@@ -65,8 +72,8 @@ State transitions are guarded by `unlock_lock` (threading.Lock).
## Dependencies

- **Internal**: `unlock_state` (UnlockState enum), `api_client` (lazy import), `binary_split` (lazy import)
- **External**: `os`, `threading` (stdlib), `fastapi`, `pydantic`
- **Internal**: `UnlockState` from `unlock_state`, `get_api_client()` (lazy `api_client` import), `binary_split` (lazy import in unlock paths)
- **External**: `os`, `threading` (stdlib), `fastapi`, `pydantic`, `loguru` (logger for tar cleanup warnings)

## Consumers
@@ -94,7 +101,7 @@ None — this is the entry point module.
- Login endpoint returns 401 on auth failure
- All resource endpoints use authenticated API client
- Unlock state is thread-safe via `threading.Lock`
- Unlock state and error are guarded by `_UnlockStateHolder`’s lock; API client initialization is guarded by `_api_client_lock`
- Lazy imports of Cython modules (`api_client`, `binary_split`) to avoid import-time side effects

## Tests
@@ -15,7 +15,7 @@ All methods are `@staticmethod cdef` — Cython-only visibility, not callable fr
| Method | Signature | Description |
|-----------------------------|-----------------------------------------------------------------|----------------------------------------------------------------------|
| `encrypt_to` | `(input_bytes, key) -> bytes` | AES-256-CBC encrypt with random IV, PKCS7 padding; returns `IV + ciphertext` |
| `decrypt_to` | `(ciphertext_with_iv_bytes, key) -> bytes` | AES-256-CBC decrypt; first 16 bytes = IV; manual PKCS7 unpad |
| `decrypt_to` | `(ciphertext_with_iv_bytes, key) -> bytes` | AES-256-CBC decrypt; first 16 bytes = IV; PKCS7 via `padding.PKCS7(128).unpadder()` |
| `get_hw_hash` | `(str hardware) -> str` | Derives hardware hash: `SHA-384("Azaion_{hardware}_%$$$)0_")` → base64 |
| `get_api_encryption_key` | `(Credentials creds, str hardware_hash) -> str` | Derives per-user+hw key: `SHA-384("{email}-{password}-{hw_hash}-#%@AzaionKey@%#---")` → base64 |
| `get_resource_encryption_key`| `() -> str` | Returns fixed shared key: `SHA-384("-#%@AzaionKey@%#---234sdfklgvhjbnn")` → base64 |
@@ -40,7 +40,7 @@ All methods are `@staticmethod cdef` — Cython-only visibility, not callable fr
1. SHA-256 hash of string key → 32-byte AES key
2. Split input: first 16 bytes = IV, rest = ciphertext
3. AES-CBC decrypt
4. Manual PKCS7 unpadding: read last byte as padding length; strip if 1–16
4. PKCS7 removal via `cryptography` `padding.PKCS7(128).unpadder()` (`update` + `finalize`)
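
As an in-memory illustration of these steps, the sketch below pairs the unpadder-based `decrypt_to` with an `encrypt_to` matching the table description (random IV, PKCS7 padding, `IV + ciphertext`). It is written as plain Python functions rather than `@staticmethod cdef` methods, and UTF-8 encoding of the string key is an assumption.

```python
import hashlib
import os

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def _aes_key(key: str) -> bytes:
    # Assumption: the string key is UTF-8 encoded before hashing.
    return hashlib.sha256(key.encode("utf-8")).digest()


def encrypt_to(input_bytes: bytes, key: str) -> bytes:
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(input_bytes) + padder.finalize()
    encryptor = Cipher(algorithms.AES(_aes_key(key)), modes.CBC(iv)).encryptor()
    return iv + encryptor.update(padded) + encryptor.finalize()


def decrypt_to(ciphertext_with_iv: bytes, key: str) -> bytes:
    iv, ciphertext = ciphertext_with_iv[:16], ciphertext_with_iv[16:]
    decryptor = Cipher(algorithms.AES(_aes_key(key)), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    # finalize() validates the padding and raises ValueError if it is malformed.
    return unpadder.update(padded) + unpadder.finalize()
```

Compared with stripping a trailing length byte by hand, the library unpadder rejects malformed padding instead of silently returning truncated or untrimmed data.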

### Key Derivation Hierarchy
- **Hardware hash**: salted hardware fingerprint → SHA-384 → base64
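
The hardware-hash derivation can be written with the standard library alone. This sketch follows the formula given for `get_hw_hash` in the methods table; UTF-8 encoding of the salted string and the standard base64 alphabet are assumptions.

```python
import base64
import hashlib


def get_hw_hash(hardware: str) -> str:
    # Salt format taken from the methods table; the encoding is assumed to be UTF-8.
    salted = f"Azaion_{hardware}_%$$$)0_"
    digest = hashlib.sha384(salted.encode("utf-8")).digest()  # 48 bytes
    return base64.b64encode(digest).decode("ascii")
```

A 48-byte digest encodes to exactly 64 base64 characters with no `=` padding, so the result always has the same length regardless of the fingerprint.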