Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
Oleksandr Bezdieniezhnykh
2026-03-27 18:18:30 +02:00
parent b68c07b540
commit 142c6c4de8
106 changed files with 5706 additions and 654 deletions
@@ -0,0 +1,90 @@
# Component: API & CDN Client
## Overview
Communication layer for the Azaion backend API and S3-compatible CDN. Handles authentication, encrypted file transfer, and the split-resource pattern for secure model distribution.
**Pattern**: Client library with split-storage resource management
**Upstream**: Core (constants), Security (encryption, hardware identity)
**Downstream**: Training, Inference, Exports
## Modules
- `api_client` — REST client for Azaion API, JWT auth, encrypted resource download/upload, split big/small pattern
- `cdn_manager` — boto3 S3 client with separate read/write credentials
## Internal Interfaces
### CDNCredentials
```python
CDNCredentials(host, downloader_access_key, downloader_access_secret, uploader_access_key, uploader_access_secret)
```
### CDNManager
```python
CDNManager(credentials: CDNCredentials)
CDNManager.upload(bucket: str, filename: str, file_bytes: bytearray) -> bool
CDNManager.download(bucket: str, filename: str) -> bool
```
### ApiCredentials
```python
ApiCredentials(url, email, password)
```
### ApiClient
```python
ApiClient()
ApiClient.login() -> None
ApiClient.upload_file(filename: str, file_bytes: bytearray, folder: str) -> None
ApiClient.load_bytes(filename: str, folder: str) -> bytes
ApiClient.load_big_small_resource(resource_name: str, folder: str, key: str) -> bytes
ApiClient.upload_big_small_resource(resource: bytes, resource_name: str, folder: str, key: str) -> None
```
## External API Specification
### Azaion REST API (consumed)
| Endpoint | Method | Auth | Description |
|----------|--------|------|-------------|
| `/login` | POST | None (returns JWT) | `{"email": ..., "password": ...}` → `{"token": ...}` |
| `/resources/{folder}` | POST | Bearer JWT | Multipart file upload |
| `/resources/get/{folder}` | POST | Bearer JWT | Download encrypted resource (sends hardware info in body) |
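
As a rough illustration of the table above, the login exchange and the bearer header could be built as follows. This is a minimal sketch, not the project's `api_client` code: the helper names, the JSON content type, and the example base URL are assumptions. Only the `/login` path, the `email`/`password` request fields, and the `token` response field come from the table.

```python
import json
import urllib.request


def build_login_request(base_url: str, email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /login request.

    Assumption: the body is sent as JSON; the source only shows its fields.
    """
    payload = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/login",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_login_response(body: bytes) -> str:
    """Extract the JWT from a {"token": ...} response body."""
    return json.loads(body)["token"]


def bearer_headers(token: str) -> dict:
    """Headers for the JWT-protected /resources endpoints."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical base URL, used only to show the resulting request shape.
request = build_login_request("https://api.example.test/", "user@example.test", "secret")
print(request.get_method(), request.full_url)
```

The request is only constructed here, never sent, so the sketch can be run without network access.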
### S3-compatible CDN
| Operation | Description |
|-----------|-------------|
| `upload_fileobj` | Upload bytes to S3 bucket |
| `download_file` | Download file from S3 bucket to disk |
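
The two operations above, combined with the separate read/write credentials, might look roughly like this. It is a sketch under assumptions rather than the actual `cdn_manager` module: `CDNManagerSketch`, `make_s3_client`, and the `client_factory` parameter are hypothetical names, and `CDNCredentials` is modelled as a dataclass purely for illustration. The boto3 calls used (`boto3.client`, `upload_fileobj`, `download_file`) are the standard ones.

```python
import io
from dataclasses import dataclass


@dataclass
class CDNCredentials:
    host: str
    downloader_access_key: str
    downloader_access_secret: str
    uploader_access_key: str
    uploader_access_secret: str


def make_s3_client(host: str, access_key: str, secret: str):
    """Create a boto3 S3 client for an S3-compatible endpoint."""
    import boto3  # imported lazily so the sketch can be exercised with fakes

    return boto3.client(
        "s3",
        endpoint_url=host,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret,
    )


class CDNManagerSketch:
    """Hypothetical stand-in for CDNManager: one client per credential pair."""

    def __init__(self, credentials: CDNCredentials, client_factory=make_s3_client):
        self.download_client = client_factory(
            credentials.host,
            credentials.downloader_access_key,
            credentials.downloader_access_secret,
        )
        self.upload_client = client_factory(
            credentials.host,
            credentials.uploader_access_key,
            credentials.uploader_access_secret,
        )

    def upload(self, bucket: str, filename: str, file_bytes: bytearray) -> bool:
        try:
            self.upload_client.upload_fileobj(io.BytesIO(file_bytes), bucket, filename)
            print(f"Uploaded {filename} to {bucket}")
            return True
        except Exception as exc:  # broad on purpose: report and signal failure
            print(f"Upload of {filename} failed: {exc}")
            return False

    def download(self, bucket: str, filename: str) -> bool:
        try:
            # Saved into the current working directory under the same name.
            self.download_client.download_file(bucket, filename, filename)
            print(f"Downloaded {filename} from {bucket}")
            return True
        except Exception as exc:
            print(f"Download of {filename} failed: {exc}")
            return False
```

Keeping two clients means a leaked download key cannot be used to overwrite objects, which is the least-privilege intent described for the CDN credentials.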
## Data Access Patterns
- API Client reads `config.yaml` on init for API credentials
- CDN credentials are loaded by the API Client from an encrypted `cdn.yaml` (itself downloaded from the API)
- Split resources: big part stored locally + CDN, small part on API server
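
The split-resource access pattern in the last bullet can be sketched as a plain byte split and join. This is a minimal illustration under assumptions, not the project's implementation: the function names are hypothetical, the small part is assumed to be the leading bytes, the caller chooses `small_size` (the project's own sizing rule is not reproduced), and encryption is assumed to have happened beforehand.

```python
def split_resource(encrypted: bytes, small_size: int) -> tuple:
    """Split already-encrypted bytes into (small, big) parts.

    Assumption: the small part is the first `small_size` bytes.
    """
    if not 0 < small_size < len(encrypted):
        raise ValueError("small_size must leave both parts non-empty")
    return encrypted[:small_size], encrypted[small_size:]


def join_resource(small: bytes, big: bytes) -> bytes:
    """Reassemble the encrypted payload; decryption happens afterwards."""
    return small + big


encrypted = bytes(range(100))           # stand-in for real ciphertext
small, big = split_resource(encrypted, small_size=20)
assert join_resource(small, big) == encrypted
```

In this sketch the small part would go to the API server and the big part to the CDN (and local storage), so a download needs both sources before decryption can start.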
## Implementation Details
- **JWT auto-refresh**: On 401/403 response, automatically re-authenticates and retries
- **Split-resource pattern**: Encrypts data → splits at a ~20% boundary (`SMALL_SIZE_KB * 1024` min) → sends the small part to the API and the big part to the CDN. Neither part alone can reconstruct the original.
- **CDN credential isolation**: Separate S3 access keys for upload vs download (least-privilege)
- **CDN self-bootstrap**: `cdn.yaml` credentials are themselves encrypted and downloaded from the API during ApiClient init
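
The JWT auto-refresh behaviour listed above can be sketched with an injected transport, which keeps the example runnable without a server. Everything here is hypothetical (`AuthRetryClient`, the `transport` and `login` callables, the lazy first login); only the rule "on 401/403, re-authenticate and retry" comes from the description.

```python
from typing import Callable, Optional, Tuple

# A transport takes (path, token) and returns (status_code, body).
Transport = Callable[[str, Optional[str]], Tuple[int, bytes]]


class AuthRetryClient:
    """Hypothetical wrapper that re-authenticates once on 401/403."""

    def __init__(self, transport: Transport, login: Callable[[], str]):
        self._transport = transport
        self._login = login
        self._token: Optional[str] = None

    def request(self, path: str) -> bytes:
        if self._token is None:
            # Assumption: log in lazily on first use.
            self._token = self._login()
        status, body = self._transport(path, self._token)
        if status in (401, 403):
            # Token rejected: log in again and retry exactly once.
            self._token = self._login()
            status, body = self._transport(path, self._token)
        if status >= 400:
            raise RuntimeError(f"{path} failed with HTTP {status}")
        return body
```

The single retry mirrors the stated behaviour: a second rejection is surfaced as an error rather than retried again.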
## Caveats
- Credentials hardcoded in `config.yaml` and `cdn.yaml` — not using environment variables or secrets manager
- `CDNManager.download()` saves the file to the current working directory under the same filename
- No retry logic beyond JWT refresh (no exponential backoff, no connection retry)
- The `cdn_manager` module imports `sys`, `yaml`, and `os` but does not use them
## Dependency Graph
```mermaid
graph TD
constants --> api_client
security --> api_client
hardware_service --> api_client
cdn_manager --> api_client
api_client --> exports
api_client --> train
api_client --> start_inference
cdn_manager --> exports
cdn_manager --> train
```
## Logging Strategy
Print statements for upload/download confirmations and errors. No structured logging.