loader/_docs/02_document/components/03_resource_management/description.md
Oleksandr Bezdieniezhnykh 8f7deb3fca Add E2E tests, fix bugs
Made-with: Cursor
2026-04-13 05:17:48 +03:00


Resource Management

1. High-Level Overview

Purpose: Orchestrates authenticated resource download/upload using a binary-split scheme (small encrypted part via API, large part via CDN), CDN storage operations, and Docker image archive decryption/loading.

Architectural Pattern: Facade — ApiClient coordinates CDN, Security, and API calls behind a unified interface.

Upstream dependencies: Core Models (01) — constants, Credentials, User, RoleEnum; Security (02) — encryption, key derivation, hardware fingerprinting

Downstream consumers: HTTP API (04) — main.py uses ApiClient for all resource operations and binary_split for Docker unlock

2. Internal Interfaces

Interface: ApiClient

| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| set_credentials_from_dict | str email, str password | - | No | API errors, YAML parse errors |
| login | - | - | No | HTTPError, Exception |
| load_big_small_resource | str resource_name, str folder | bytes | No | Exception (API, CDN, decrypt) |
| upload_big_small_resource | bytes resource, str resource_name, str folder | - | No | Exception (API, CDN, encrypt) |
| upload_to_cdn | str bucket, str filename, bytes file_bytes | - | No | Exception |
| download_from_cdn | str bucket, str filename | bytes | No | Exception |

Cython-only methods (cdef): set_credentials, set_token, get_user, request, list_files, check_resource, load_bytes, upload_file, load_big_file_cdn

Interface: CDNManager

| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| upload | str bucket, str filename, bytes file_bytes | bool | No | boto3 exceptions |
| download | str folder, str filename | bool | No | boto3 exceptions |
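
The bool-returning contract above can be sketched as follows. This is an illustration under stated assumptions, not the component's code: the class name `CDNManagerSketch` is hypothetical, the client is injected so the example runs without boto3, and the sketch assumes `folder` doubles as the bucket name and the local directory. The two client methods it calls, `upload_fileobj(Fileobj, Bucket, Key)` and `download_file(Bucket, Key, Filename)`, are the boto3 S3 client methods of those names.

```python
import io
import logging

logger = logging.getLogger("cdn")


class CDNManagerSketch:
    """Hypothetical wrapper: reports success as a bool instead of raising."""

    def __init__(self, client):
        # `client` is expected to look like a boto3 S3 client.
        self._client = client

    def upload(self, bucket: str, filename: str, file_bytes: bytes) -> bool:
        try:
            self._client.upload_fileobj(io.BytesIO(file_bytes), bucket, filename)
            return True
        except Exception as exc:
            # Swallow the error and log it, as the interface table implies.
            logger.error("Upload fail: %r", exc)
            return False

    def download(self, folder: str, filename: str) -> bool:
        try:
            # Assumption: `folder` is both the bucket and the local directory.
            self._client.download_file(folder, filename, f"{folder}/{filename}")
            return True
        except Exception as exc:
            logger.error("Download fail: %r", exc)
            return False
```

Because failures come back as `False`, a caller has to check the return value explicitly; nothing will propagate on its own.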

Interface: binary_split (module-level functions)

| Function | Input | Output | Async | Error Types |
|---|---|---|---|---|
| download_key_fragment | str resource_api_url, str token | bytes | No | requests.HTTPError |
| decrypt_archive | str encrypted_path, bytes key_fragment, str output_path | - | No | crypto/IO errors |
| docker_load | str tar_path | - | No | subprocess.CalledProcessError |
| check_images_loaded | str version | bool | No | - |
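
A minimal sketch of the two Docker-facing functions, assuming they shell out to `docker load -i` and `docker image inspect`. The service names in `API_SERVICES`, the `<service>:<version>` tag format, and the injectable `run` parameter are all assumptions made for illustration; the real list and tag scheme are not shown in this document.

```python
import subprocess

# Hypothetical service names; the real API_SERVICES list is not shown here.
API_SERVICES = ["service-a", "service-b"]


def docker_load(tar_path: str, run=subprocess.run) -> None:
    # check=True raises subprocess.CalledProcessError on a non-zero exit code.
    run(["docker", "load", "-i", tar_path], check=True)


def check_images_loaded(version: str, run=subprocess.run) -> bool:
    for service in API_SERVICES:
        # Assumption: images are tagged "<service>:<version>".
        result = run(
            ["docker", "image", "inspect", f"{service}:{version}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            return False
    return True
```

Passing `run` as a parameter keeps the functions testable without a Docker daemon; production code would simply use the default.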

3. External API Specification

N/A — this component is consumed by HTTP API (04), not directly exposed.

4. Data Access Patterns

Caching Strategy

| Data | Cache Type | TTL | Invalidation |
|---|---|---|---|
| CDN config (cdn.yaml) | In-memory (CDNManager) | Process lifetime | On re-authentication |
| JWT token | In-memory | Until 401/403 | Auto-refresh on auth error |
| Big file parts | Local filesystem | Until version mismatch | Overwritten on new upload |

Storage Estimates

| Location | Description | Growth Rate |
|---|---|---|
| {folder}/{name}.big | Cached large resource parts | Per resource upload |
| Logs/ | Loguru log files | ~daily rotation, 30-day retention |

5. Implementation Details

State Management: ApiClient is a stateful singleton (token, credentials, CDN manager). binary_split is stateless.

Key Dependencies:

| Library | Version | Purpose |
|---|---|---|
| requests | 2.32.4 | HTTP client for API calls |
| pyjwt | 2.10.1 | JWT token decoding (no verification) |
| boto3 | 1.40.9 | S3-compatible CDN operations |
| pyyaml | 6.0.2 | CDN config parsing |
| cryptography | 44.0.2 | AES-256-CBC for archive decryption |
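
The "no verification" note for pyjwt means the token payload is read without checking its signature. The sketch below shows that idea with the standard library only; it is a stand-in, not the component's code. With PyJWT 2.x the equivalent call is `jwt.decode(token, options={"verify_signature": False})`. The function name and the sample claim are hypothetical.

```python
import base64
import json


def decode_jwt_payload_unverified(token: str) -> dict:
    """Read the claims of a JWT without checking its signature."""
    # A JWT is "<header>.<payload>.<signature>", each part base64url-encoded.
    payload_b64 = token.split(".")[1]
    # base64url in JWTs omits padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj: dict) -> str:
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build a sample token with a meaningless signature; it still decodes.
sample_token = ".".join(
    [_b64url({"alg": "none"}), _b64url({"email": "user@example.com"}), "sig"]
)
claims = decode_jwt_payload_unverified(sample_token)
```

Since the signature is never checked, the claims are only as trustworthy as the channel the token arrived on.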

Error Handling Strategy:

  • request() auto-retries on 401/403 (re-login then retry once)
  • 500 errors raise Exception with response text
  • 409 (Conflict) errors raise with parsed ErrorCode/Message
  • CDN operations return bool (True/False) — swallow exceptions, log error
  • binary_split functions propagate all errors to caller
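
The retry and error rules in the list above can be sketched as follows. `RetryingClient`, `send`, and `login` are hypothetical names used so the example runs without a network; the sketch also assumes the 409 body is JSON with `ErrorCode` and `Message` keys.

```python
import json
from typing import Callable, Tuple

Response = Tuple[int, str]  # (status_code, body_text)


class RetryingClient:
    """Hypothetical stand-in for ApiClient.request(); not the real class."""

    def __init__(self, send: Callable[[str, str], Response], login: Callable[[], str]):
        self._send = send    # hypothetical transport: (url, token) -> (status, text)
        self._login = login  # hypothetical login: returns a fresh token
        self._token = login()

    def request(self, url: str) -> str:
        status, text = self._send(url, self._token)
        if status in (401, 403):
            # Token rejected: log in again and retry exactly once.
            self._token = self._login()
            status, text = self._send(url, self._token)
        if status == 500:
            raise Exception(f"Server error: {text}")
        if status == 409:
            # Assumption: the conflict body is JSON with ErrorCode and Message.
            body = json.loads(text)
            raise Exception(f"{body['ErrorCode']}: {body['Message']}")
        return text
```

The single retry keeps a permanently rejected credential from looping; what happens when the second attempt also returns 401/403 is not specified in this document, so the sketch just returns the response text.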

Big/Small Resource Split Protocol:

  • Download: small part (encrypted per-user+hw key) from API + big part from local cache or CDN → concatenate → decrypt with shared resource key
  • Upload: encrypt the entire resource with the shared key → split at min(3 KB, 30% of the encrypted size) → small part to API, big part to CDN plus a local copy
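
The split point can be illustrated with a short sketch. It assumes 1 KB means 1024 bytes, that the 30% is taken of the encrypted length, and that the small part is the head of the encrypted blob; the function names are hypothetical and encryption itself is left out.

```python
from typing import Tuple

SMALL_LIMIT_BYTES = 3 * 1024  # "3KB", assuming 1 KB = 1024 bytes


def split_resource(encrypted: bytes) -> Tuple[bytes, bytes]:
    """Split an encrypted blob into (small, big) at min(3 KB, 30%)."""
    # Integer arithmetic avoids float rounding at the 30% boundary.
    small_len = min(SMALL_LIMIT_BYTES, len(encrypted) * 3 // 10)
    return encrypted[:small_len], encrypted[small_len:]


def join_resource(small: bytes, big: bytes) -> bytes:
    """Reverse of split_resource: concatenate before decrypting."""
    return small + big


small, big = split_resource(bytes(100))
large_small, large_big = split_resource(bytes(20_000))
```

For a 100-byte blob the 30% rule wins (30 bytes to the API); for a 20,000-byte blob the 3 KB cap wins (3072 bytes to the API).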

6. Extensions and Helpers

None.

7. Caveats & Edge Cases

Known limitations:

  • JWT token decoded without signature verification — trusts the API server
  • CDN manager initialization requires a successful encrypted download (bootstrapping: credentials must already work for the login call that precedes CDN config download)
  • load_big_small_resource attempts local cache first; on decrypt failure (version mismatch), silently falls through to CDN download — the error is logged but not surfaced to caller
  • API_SERVICES list in binary_split is hardcoded — adding a new service requires code change
  • docker_load and check_images_loaded shell out to Docker CLI — requires Docker CLI in the container
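
The cache-first behavior of load_big_small_resource described in the list above can be sketched like this. All names are hypothetical, and the cache reader, CDN fetcher, and decryptor are passed in as callables so the example is self-contained.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("resource")


def load_with_cache_fallback(
    small: bytes,
    read_cache: Callable[[], Optional[bytes]],
    fetch_cdn: Callable[[], bytes],
    decrypt: Callable[[bytes], bytes],
) -> bytes:
    cached = read_cache()
    if cached is not None:
        try:
            return decrypt(small + cached)
        except Exception as exc:
            # Logged only: the caller never learns the cache was stale.
            logger.error("Cached big part failed to decrypt: %r", exc)
    # No cache, or a stale one: fall through to the CDN copy.
    return decrypt(small + fetch_cdn())
```

A failure of the second `decrypt` call does propagate, so the caller only sees an error when both the cached and the CDN copies are unusable.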

Potential race conditions:

  • api_client singleton in main.py is initialized without locking; concurrent first requests could create multiple instances (only one is kept)
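
One common way to close this race is double-checked initialization behind a lock. The sketch below is a possible remedy, not what main.py does today; `get_api_client` and `factory` are hypothetical names.

```python
import threading

_api_client = None
_api_client_lock = threading.Lock()


def get_api_client(factory):
    """Create the shared client at most once, even under concurrent calls."""
    global _api_client
    if _api_client is None:  # fast path once initialized
        with _api_client_lock:
            if _api_client is None:  # re-check while holding the lock
                _api_client = factory()
    return _api_client
```

The second check inside the lock is what prevents two threads that both saw `None` from each running `factory()`.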

Performance bottlenecks:

  • Large resource encryption/decryption is synchronous and in-memory
  • CDN downloads are synchronous (blocking the thread)
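
If the caller runs on an asyncio event loop (an assumption; this document does not say so), a blocking download can be moved to a worker thread with `asyncio.to_thread` (Python 3.9+). `blocking_download` is a hypothetical placeholder for the synchronous CDN call.

```python
import asyncio
import time


def blocking_download(bucket: str, filename: str) -> bytes:
    # Placeholder for a synchronous CDN download.
    time.sleep(0.05)
    return f"{bucket}/{filename}".encode()


async def download_async(bucket: str, filename: str) -> bytes:
    # Runs the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_download, bucket, filename)


async def download_many() -> list:
    # Both downloads overlap instead of running back to back.
    return await asyncio.gather(
        download_async("models", "a.big"),
        download_async("models", "b.big"),
    )


results = asyncio.run(download_many())
```

This does not make the download itself faster, and in-memory encryption of large resources still holds the whole payload in RAM.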

8. Dependency Graph

Must be implemented after: Core Models (01), Security (02)

Can be implemented in parallel with: —

Blocks: HTTP API (04)

9. Logging Strategy

| Log Level | When | Example |
|---|---|---|
| INFO | File downloaded | "Downloaded file: cdn.yaml, 1234 bytes" |
| INFO | File uploaded | "Uploaded model.bin to api.azaion.com/models successfully: 200." |
| INFO | CDN operation | "downloaded model.big from the models" |
| INFO | Big file check | "checking on existence for models/model.big" |
| ERROR | Upload failure | "Upload fail: ConnectionError(...)" |
| ERROR | API error | "{'ErrorCode': 409, 'Message': '...'}" |

Log format: Via constants.log() / constants.logerror()

Log storage: Same as Core Models logging configuration