loader/_docs/02_document/system-flows.md
Oleksandr Bezdieniezhnykh 8f7deb3fca Add E2E tests, fix bugs
Made-with: Cursor
2026-04-13 05:17:48 +03:00

# Azaion.Loader — System Flows
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|--------------------|----------------------------|-----------------------------|-------------|
| F1| Authentication | POST `/login` | 04 HTTP API, 03 Resource Mgmt | High |
| F2| Resource Download | POST `/load/{filename}` | 04, 03, 02 | High |
| F3| Resource Upload | POST `/upload/{filename}` | 04, 03, 02 | High |
| F4| Docker Unlock | POST `/unlock` | 04, 03 | High |
| F5| Unlock Status Poll | GET `/unlock/status` | 04 | Medium |
| F6| Health/Status | GET `/health`, `/status` | 04 | Low |
## Flow Dependencies
| Flow | Depends On | Shares Data With |
|------|--------------------------------|-------------------------------|
| F1 | — | F2, F3, F4 (via JWT token) |
| F2 | F1 (credentials must be set) | — |
| F3 | F1 (credentials must be set) | — |
| F4 | — (authenticates internally) | F5 (via unlock_state) |
| F5 | F4 (must be started) | — |
| F6 | — | F1 (reads auth state) |
---
## Flow F1: Authentication
### Description
Client sends email/password to set credentials on the API client singleton. This initializes the CDN manager by downloading and decrypting `cdn.yaml` from the Azaion Resource API.
### Preconditions
- Loader service is running
- Azaion Resource API is reachable
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant HTTPApi as HTTP API (main)
participant ApiClient as ApiClient
participant Security as Security
participant HW as HardwareService
participant ResourceAPI as Azaion Resource API
Client->>HTTPApi: POST /login {email, password}
HTTPApi->>ApiClient: set_credentials_from_dict(email, password)
ApiClient->>ApiClient: set_credentials(Credentials)
ApiClient->>ApiClient: login()
ApiClient->>ResourceAPI: POST /login {email, password}
ResourceAPI-->>ApiClient: {token: "jwt..."}
ApiClient->>ApiClient: set_token(jwt) → decode claims → create User
ApiClient->>HW: get_hardware_info()
HW-->>ApiClient: "CPU: ... GPU: ..."
ApiClient->>Security: get_hw_hash(hardware)
Security-->>ApiClient: hw_hash
ApiClient->>Security: get_api_encryption_key(creds, hw_hash)
Security-->>ApiClient: api_key
ApiClient->>ResourceAPI: POST /resources/get/ {cdn.yaml, encrypted}
ResourceAPI-->>ApiClient: encrypted bytes
ApiClient->>Security: decrypt_to(bytes, api_key)
Security-->>ApiClient: cdn.yaml content
ApiClient->>ApiClient: parse YAML → init CDNManager
HTTPApi-->>Client: {"status": "ok"}
```
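The `set_token(jwt) → decode claims` step can be illustrated with the standard library alone. This is a minimal sketch, not the loader's code: `decode_jwt_claims` is a hypothetical helper, and it only reads the payload segment without verifying the signature. Whether the loader verifies signatures is not stated in this document.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims from a JWT payload WITHOUT verifying the signature."""
    # A JWT has three dot-separated segments: header.payload.signature
    payload_b64 = token.split(".")[1]
    # base64url segments drop their padding; restore it before decoding
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

The returned dictionary would then be used to build the `User` object mentioned in the diagram.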
### Error Scenarios
| Error | Where | Detection | Recovery |
|--------------------|--------------------|--------------------|------------------------------|
| Invalid credentials| Resource API login | HTTPError (401/409)| Raise Exception → HTTP 401 |
| API unreachable | POST /login | ConnectionError | Raise Exception → HTTP 401 |
| CDN config decrypt | decrypt_to() | Crypto error | Raise Exception → HTTP 401 |
---
## Flow F2: Resource Download (Big/Small Split)
### Description
Client requests a resource by name. The loader downloads the small encrypted part from the API (per-user+hw key), retrieves the big part from local cache or CDN, concatenates them, and decrypts with the shared resource key.
### Preconditions
- Credentials set (F1 completed)
- Resource exists on API and CDN
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant HTTPApi as HTTP API
participant ApiClient as ApiClient
participant Security as Security
participant ResourceAPI as Azaion Resource API
participant CDN as S3 CDN
participant FS as Local Filesystem
Client->>HTTPApi: POST /load/{filename} {filename, folder}
HTTPApi->>ApiClient: load_big_small_resource(name, folder)
ApiClient->>ApiClient: load_bytes(name.small, folder)
ApiClient->>ResourceAPI: POST /resources/get/{folder} (encrypted)
ResourceAPI-->>ApiClient: encrypted small part
ApiClient->>Security: decrypt_to(small_bytes, api_key)
Security-->>ApiClient: decrypted small part
ApiClient->>Security: get_resource_encryption_key()
Security-->>ApiClient: shared_key
alt Local big part exists
ApiClient->>FS: read folder/name.big
FS-->>ApiClient: local_big_bytes
ApiClient->>Security: decrypt_to(small + local_big, shared_key)
Security-->>ApiClient: plaintext resource
else Local not found or decrypt fails
ApiClient->>CDN: download(folder, name.big)
CDN-->>ApiClient: remote_big_bytes
ApiClient->>Security: decrypt_to(small + remote_big, shared_key)
Security-->>ApiClient: plaintext resource
end
HTTPApi-->>Client: binary response (octet-stream)
```
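The `alt` branch in the diagram amounts to "try the local big part, fall back to the CDN". A minimal sketch under stated assumptions: `load_big_small` and its injected callables are hypothetical stand-ins for the filesystem read, the CDN download, and `decrypt_to` with the shared key.

```python
from typing import Callable, Optional


def load_big_small(small: bytes,
                   read_local_big: Callable[[], Optional[bytes]],
                   download_big: Callable[[], bytes],
                   decrypt: Callable[[bytes], bytes]) -> bytes:
    """Try the locally cached big part first, then fall back to the CDN."""
    local_big = read_local_big()  # None means no cached copy exists
    if local_big is not None:
        try:
            return decrypt(small + local_big)
        except Exception:
            # A stale or corrupt local copy falls through to the CDN download.
            pass
    remote_big = download_big()
    # Errors from the CDN path are not caught; they propagate to the caller.
    return decrypt(small + remote_big)
```

Injecting the three operations as callables keeps the fallback order testable without a real CDN or key.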
### Error Scenarios
| Error | Where | Detection | Recovery |
|----------------------|-------------------|-----------------|----------------------------------|
| Token expired | request() | 401/403 | Auto re-login, retry once |
| CDN download fail | cdn_manager | Exception | Raise to caller → HTTP 500 |
| Decrypt failure (local)| Security | Exception | Fall through to CDN download |
| API 500 | request() | Status code | Raise Exception → HTTP 500 |
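The "auto re-login, retry once" recovery for an expired token can be sketched as follows. `request_with_relogin`, `send`, and `relogin` are hypothetical names; the real `request()` presumably returns a response object rather than a bare status code.

```python
from typing import Callable


def request_with_relogin(send: Callable[[], int],
                         relogin: Callable[[], None]) -> int:
    """Send a request; on 401/403 re-login and retry exactly once."""
    status = send()
    if status in (401, 403):
        relogin()
        # Single retry: a second auth failure is returned to the caller as-is.
        status = send()
    return status
```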
---
## Flow F3: Resource Upload (Big/Small Split)
### Description
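Client uploads a resource file. The loader encrypts it with the shared resource key and splits the ciphertext into a small part (the smaller of 3 KB and 30% of the encrypted size) and a big part (the remainder). It uploads the big part to the CDN, writes it to the local cache, and uploads the small part to the API.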
### Preconditions
- Credentials set (F1 completed)
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant HTTPApi as HTTP API
participant ApiClient as ApiClient
participant Security as Security
participant ResourceAPI as Azaion Resource API
participant CDN as S3 CDN
participant FS as Local Filesystem
Client->>HTTPApi: POST /upload/{filename} (multipart: file + folder)
HTTPApi->>ApiClient: upload_big_small_resource(bytes, name, folder)
ApiClient->>Security: get_resource_encryption_key()
Security-->>ApiClient: shared_key
ApiClient->>Security: encrypt_to(resource, shared_key)
Security-->>ApiClient: encrypted_bytes
ApiClient->>ApiClient: split: small = min(3KB, 30%), big = rest
ApiClient->>CDN: upload(folder, name.big, big_bytes)
ApiClient->>FS: write folder/name.big (local cache)
ApiClient->>ApiClient: upload_file(name.small, small_bytes, folder)
ApiClient->>ResourceAPI: POST /resources/{folder} (multipart)
HTTPApi-->>Client: {"status": "ok"}
```
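The split step `small = min(3KB, 30%), big = rest` can be sketched as below. Assumptions: "3KB" means 3072 bytes, the 30% is taken of the encrypted length and rounded down, and the small part is the leading prefix (consistent with the download flow concatenating `small + big`). `split_big_small` is a hypothetical name.

```python
from typing import Tuple

SMALL_MAX_BYTES = 3 * 1024  # assumption: "3KB" means 3072 bytes
SMALL_PERCENT = 30          # small part is at most 30% of the ciphertext


def split_big_small(encrypted: bytes) -> Tuple[bytes, bytes]:
    """Split ciphertext into (small, big); small is the leading prefix."""
    # Integer arithmetic avoids floating-point rounding surprises.
    small_size = min(SMALL_MAX_BYTES, len(encrypted) * SMALL_PERCENT // 100)
    return encrypted[:small_size], encrypted[small_size:]
```

For a 1000-byte ciphertext the 30% bound applies (300 bytes); for anything larger than 10240 bytes the 3072-byte cap applies.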
---
## Flow F4: Docker Image Unlock
### Description
Client triggers the unlock workflow with credentials. A background task authenticates, downloads a key fragment, decrypts the encrypted Docker image archive, and loads it into Docker.
### Preconditions
- Encrypted archive exists at `IMAGES_PATH`
- Docker daemon is accessible (socket mounted)
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant HTTPApi as HTTP API
participant BinarySplit as binary_split
participant ApiClient as ApiClient
participant ResourceAPI as Azaion Resource API
participant Docker as Docker CLI
Client->>HTTPApi: POST /unlock {email, password}
HTTPApi->>HTTPApi: check unlock_state (idle/error?)
HTTPApi->>HTTPApi: check IMAGES_PATH exists
HTTPApi->>HTTPApi: start background task
HTTPApi-->>Client: {"state": "authenticating"}
Note over HTTPApi: Background task (_run_unlock)
HTTPApi->>BinarySplit: check_images_loaded(version)
BinarySplit->>Docker: docker image inspect (×7 services)
alt Images already loaded
HTTPApi->>HTTPApi: unlock_state = ready
else Images not loaded
HTTPApi->>ApiClient: set_credentials + login()
ApiClient->>ResourceAPI: POST /login
ResourceAPI-->>ApiClient: JWT token
HTTPApi->>BinarySplit: download_key_fragment(url, token)
BinarySplit->>ResourceAPI: GET /binary-split/key-fragment
ResourceAPI-->>BinarySplit: key_fragment bytes
HTTPApi->>BinarySplit: decrypt_archive(images.enc, key, images.tar)
Note over BinarySplit: AES-256-CBC decrypt, strip padding
HTTPApi->>BinarySplit: docker_load(images.tar)
BinarySplit->>Docker: docker load -i images.tar
HTTPApi->>HTTPApi: remove tar, set unlock_state = ready
end
```
### Flowchart
```mermaid
flowchart TD
Start([POST /unlock]) --> CheckState{State is idle or error?}
CheckState -->|No| ReturnCurrent([Return current state])
CheckState -->|Yes| CheckArchive{Archive exists?}
CheckArchive -->|No| CheckLoaded{Images already loaded?}
CheckLoaded -->|Yes| SetReady([Set ready])
CheckLoaded -->|No| Error404([404: Archive not found])
CheckArchive -->|Yes| StartBG[Start background task]
StartBG --> BGCheck{Images already loaded?}
BGCheck -->|Yes| BGReady([Set ready])
BGCheck -->|No| Auth[Authenticate + login]
Auth --> DownloadKey[Download key fragment]
DownloadKey --> Decrypt[Decrypt archive]
Decrypt --> DockerLoad[docker load]
DockerLoad --> Cleanup[Remove tar]
Cleanup --> BGReady
```
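The synchronous branches of the flowchart (everything before the background task starts) reduce to a small decision function. This is a sketch only: `unlock_decision` and its string return values are hypothetical labels, not identifiers from the loader.

```python
def unlock_decision(state: str, archive_exists: bool, images_loaded: bool) -> str:
    """Mirror the synchronous branches of the POST /unlock flowchart."""
    if state not in ("idle", "error"):
        # An unlock is already running or finished: report it unchanged.
        return "return_current_state"
    if not archive_exists:
        # Without an archive, success is only possible if images are present.
        return "set_ready" if images_loaded else "http_404"
    return "start_background_task"
```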
### Error Scenarios
| Error | Where | Detection | Recovery |
|--------------------|----------------------|----------------------|-----------------------------------|
| Archive missing | /unlock endpoint | os.path.exists check | 404 if images not already loaded |
| Auth failure | ApiClient.login() | HTTPError | unlock_state = error |
| Key download fail | download_key_fragment| HTTPError | unlock_state = error |
| Decrypt failure | decrypt_archive | Crypto/IO error | unlock_state = error |
| Docker load fail | docker_load | CalledProcessError | unlock_state = error |
| Tar cleanup fail | os.remove | OSError | Silently ignored |
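The recovery column above follows one pattern: any failing step sets `unlock_state = error`, while a failed tar cleanup is ignored. A minimal sketch, assuming a hypothetical `run_unlock` that receives the pipeline as a single callable and returns the state as a dictionary. Whether the real task also removes the tar after a failure is not stated here; this sketch does not.

```python
import os
from typing import Callable, Dict, Optional


def run_unlock(steps: Callable[[], None], tar_path: str) -> Dict[str, Optional[str]]:
    """Run the unlock steps and report the resulting state."""
    try:
        steps()  # authenticate, download key fragment, decrypt, docker load
    except Exception as exc:
        # Any failing step ends the workflow in the error state.
        return {"state": "error", "error": str(exc)}
    try:
        os.remove(tar_path)
    except OSError:
        pass  # tar cleanup failure is silently ignored
    return {"state": "ready", "error": None}
```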
---
## Flow F5: Unlock Status Poll
### Description
Client polls the progress of the unlock workflow. The response contains the current state and any error message.
### Preconditions
- F4 has been initiated (or state is idle)
### Data Flow
| Step | From | To | Data | Format |
|------|--------|--------|-------------------------------|--------|
| 1 | Client | HTTPApi| GET /unlock/status | — |
| 2 | HTTPApi| Client | {state, error} | JSON |
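A client-side polling loop for this flow might look like the sketch below. `wait_for_unlock`, its parameters, and the choice of `ready` and `error` as terminal states are assumptions made for illustration; `get_status` stands in for a `GET /unlock/status` call that returns the parsed JSON.

```python
import time
from typing import Callable, Dict


def wait_for_unlock(get_status: Callable[[], Dict],
                    interval_s: float = 1.0,
                    max_polls: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until the state is "ready" or "error", or the poll budget runs out."""
    status = get_status()
    for _ in range(max_polls - 1):
        if status.get("state") in ("ready", "error"):
            break
        sleep(interval_s)
        status = get_status()
    return status
```

Passing `sleep` as a parameter lets a test replace the real delay with a recorder.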
---
## Flow F6: Health & Status
### Description
The liveness probe (`/health`) returns a static healthy response. The status check (`/status`) returns the authentication state and the model cache directory.
### Data Flow
| Step | From | To | Data | Format |
|------|--------|--------|----------------------------------------|--------|
| 1 | Client | HTTPApi| GET /health or /status | — |
| 2 | HTTPApi| Client | {status, authenticated?, modelCacheDir?}| JSON |