Component: API & CDN Client
Overview
Communication layer for the Azaion backend API and S3-compatible CDN. Handles authentication, encrypted file transfer, and the split-resource pattern for secure model distribution.
Pattern: Client library with split-storage resource management
Upstream: Core (constants), Security (encryption, hardware identity)
Downstream: Training, Inference, Exports
Modules
- api_client: REST client for the Azaion API, JWT auth, encrypted resource download/upload, split big/small pattern
- cdn_manager: boto3 S3 client with separate read/write credentials
Internal Interfaces
CDNCredentials
CDNCredentials(host, downloader_access_key, downloader_access_secret, uploader_access_key, uploader_access_secret)
CDNManager
CDNManager(credentials: CDNCredentials)
CDNManager.upload(bucket: str, filename: str, file_bytes: bytearray) -> bool
CDNManager.download(bucket: str, filename: str) -> bool
ApiCredentials
ApiCredentials(url, email, password)
ApiClient
ApiClient()
ApiClient.login() -> None
ApiClient.upload_file(filename: str, file_bytes: bytearray, folder: str) -> None
ApiClient.load_bytes(filename: str, folder: str) -> bytes
ApiClient.load_big_small_resource(resource_name: str, folder: str, key: str) -> bytes
ApiClient.upload_big_small_resource(resource: bytes, resource_name: str, folder: str, key: str) -> None
External API Specification
Azaion REST API (consumed)
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| /login | POST | None (returns JWT) | {"email": ..., "password": ...} → {"token": ...} |
| /resources/{folder} | POST | Bearer JWT | Multipart file upload |
| /resources/get/{folder} | POST | Bearer JWT | Download encrypted resource (sends hardware info in body) |
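
The login and bearer-token flow in the table, together with the re-authentication on 401/403 noted under Implementation Details, might look roughly like the sketch below. It is not the actual `ApiClient` code: `JwtSession`, `Response`, and the injected `post` callable are hypothetical names introduced here so the example runs without a network or an HTTP library, and the single-retry policy is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type for this sketch)."""
    status: int
    json: dict
    body: bytes = b""


# post(url, headers, payload) -> Response; real code would wrap an HTTP library here.
PostFn = Callable[[str, dict, dict], Response]


class JwtSession:
    """Hypothetical helper: logs in via /login and retries once on 401/403."""

    def __init__(self, base_url: str, email: str, password: str, post: PostFn):
        self.base_url = base_url.rstrip("/")
        self.email = email
        self.password = password
        self.post = post
        self.token: Optional[str] = None

    def login(self) -> None:
        # /login takes email and password and returns {"token": ...}.
        resp = self.post(f"{self.base_url}/login", {},
                         {"email": self.email, "password": self.password})
        if resp.status != 200:
            raise RuntimeError(f"login failed with status {resp.status}")
        self.token = resp.json["token"]

    def authed_post(self, path: str, payload: dict) -> Response:
        if self.token is None:
            self.login()
        resp = self._send(path, payload)
        if resp.status in (401, 403):
            # Token rejected: re-authenticate and retry exactly once (assumed policy).
            self.login()
            resp = self._send(path, payload)
        return resp

    def _send(self, path: str, payload: dict) -> Response:
        headers = {"Authorization": f"Bearer {self.token}"}
        return self.post(f"{self.base_url}{path}", headers, payload)
```

Passing the transport in as a callable keeps the retry logic separate from the HTTP library, which also makes it straightforward to exercise with a fake.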
S3-compatible CDN
| Operation | Description |
|---|---|
| upload_fileobj | Upload bytes to S3 bucket |
| download_file | Download file from S3 bucket to disk |
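
A minimal sketch of how the two S3 operations could sit behind the `CDNManager` interface, with one client per credential pair. The class name `CDNManagerSketch`, the `client_factory` parameter, and the error handling are assumptions for illustration; only `boto3.client`, `upload_fileobj`, and `download_file` are real boto3 calls, and the actual module may be organized differently.

```python
import io
from dataclasses import dataclass


@dataclass
class CDNCredentials:
    host: str
    downloader_access_key: str
    downloader_access_secret: str
    uploader_access_key: str
    uploader_access_secret: str


def boto3_s3_client(host: str, key: str, secret: str):
    """Default factory: builds a boto3 S3 client for an S3-compatible endpoint."""
    import boto3  # imported lazily so the sketch loads without boto3 installed
    return boto3.client("s3", endpoint_url=host,
                        aws_access_key_id=key, aws_secret_access_key=secret)


class CDNManagerSketch:
    """Hypothetical stand-in for CDNManager: separate clients for read and write."""

    def __init__(self, credentials: CDNCredentials, client_factory=boto3_s3_client):
        c = credentials
        self.download_client = client_factory(
            c.host, c.downloader_access_key, c.downloader_access_secret)
        self.upload_client = client_factory(
            c.host, c.uploader_access_key, c.uploader_access_secret)

    def upload(self, bucket: str, filename: str, file_bytes: bytearray) -> bool:
        try:
            # upload_fileobj expects a file-like object, so wrap the bytes.
            self.upload_client.upload_fileobj(io.BytesIO(file_bytes), bucket, filename)
            return True
        except Exception as e:
            print(f"upload failed: {e}")
            return False

    def download(self, bucket: str, filename: str) -> bool:
        try:
            # Saves into the current working directory under the same name.
            self.download_client.download_file(bucket, filename, filename)
            return True
        except Exception as e:
            print(f"download failed: {e}")
            return False
```

Keeping the uploader and downloader clients apart means a process that only reads models never needs the write key in memory.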
Data Access Patterns
- API Client reads config.yaml on init for API credentials
- CDN credentials loaded by API Client from encrypted cdn.yaml (downloaded from API)
- Split resources: big part stored locally + CDN, small part on API server
Implementation Details
- JWT auto-refresh: On 401/403 response, automatically re-authenticates and retries
- Split-resource pattern: Encrypts data → splits at ~20% (SMALL_SIZE_KB * 1024 min) boundary → small part to API, big part to CDN. Neither part alone can reconstruct the original.
- CDN credential isolation: Separate S3 access keys for upload vs download (least-privilege)
- CDN self-bootstrap: cdn.yaml credentials are themselves encrypted and downloaded from the API during ApiClient init
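
The split step of the split-resource pattern can be sketched as below, operating on bytes that are already encrypted (the Security module's encryption is out of scope here). The sizing rule is an assumption: this sketch reads "~20% (SMALL_SIZE_KB * 1024 min)" as "20% of the encrypted length, but at least SMALL_SIZE_KB * 1024 bytes", and it takes the small part from the start of the data. The function names are hypothetical and the real code may size or order the parts differently.

```python
from typing import Tuple


def small_part_size(total_len: int, small_size_kb: int, fraction: float = 0.2) -> int:
    """Size of the small part in bytes (assumed rule, see lead-in)."""
    size = max(small_size_kb * 1024, int(total_len * fraction))
    # Never ask for more bytes than the resource contains.
    return min(size, total_len)


def split_resource(encrypted: bytes, small_size_kb: int) -> Tuple[bytes, bytes]:
    """Split encrypted bytes into (small, big); small would go to the API, big to the CDN."""
    cut = small_part_size(len(encrypted), small_size_kb)
    return encrypted[:cut], encrypted[cut:]


def join_resource(small: bytes, big: bytes) -> bytes:
    """Reassemble the encrypted resource before decryption."""
    return small + big
```

Because the split happens after encryption, both parts are needed to recover the ciphertext before it can be decrypted with the key.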
Caveats
- Credentials hardcoded in config.yaml and cdn.yaml, not using environment variables or a secrets manager
- cdn_manager.download() saves to current working directory with the same filename
- No retry logic beyond JWT refresh (no exponential backoff, no connection retry)
- CDNManager imports sys, yaml, os but doesn't use them
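
For reference, the kind of retry the third caveat says is missing could be added with a small wrapper like the one below. This is a sketch only, not part of the current code: `retry_with_backoff`, its parameters, and the choice of retryable exception types are all hypothetical.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call op(), retrying on transient errors with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return op()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A download could then be wrapped as `retry_with_backoff(lambda: client.load_bytes(name, folder))`, assuming `load_bytes` raises one of the listed exception types on connection failure.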
Dependency Graph
graph TD
constants --> api_client
security --> api_client
hardware_service --> api_client
cdn_manager --> api_client
api_client --> exports
api_client --> train
api_client --> start_inference
cdn_manager --> exports
cdn_manager --> train
Logging Strategy
Print statements for upload/download confirmations and errors. No structured logging.