Quality cleanup refactoring

Made-with: Cursor
2026-04-22 08:06:34 +00:00 · 2026-04-13 06:21:26 +03:00
parent 8f7deb3fca
commit 4eaf218f09
33 changed files with 957 additions and 207 deletions
@@ -83,6 +83,7 @@ No tests exist. Coverage is 0% across all categories.
 | 3 | Should endpoint-level authorization be enforced? | Security — currently all endpoints accessible post-login | Team |
 | 4 | Should the resource encryption key be per-user instead of shared? | Security — currently all users share one key for big/small split | Team |
 | 5 | What are the target latency/throughput requirements? | Performance — no SLAs defined | Product |
+| 6 | Investigate replacing binary-split security with TPM on Jetson Orin Nano | Architecture — the binary-split model was designed for untrusted end-user laptops; SaaS/edge deployment on Jetson Orin Nano can use TPM instead, potentially simplifying the loader significantly | Team |

 ## Artifact Index

@@ -17,7 +17,6 @@ Central API client that orchestrates authentication, encrypted resource download
 | token       | str         | JWT bearer token                   |
 | cdn_manager | CDNManager  | CDN upload/download client         |
 | api_url     | str         | Base URL for the resource API      |
-| folder      | str         | Declared in `.pxd` but never assigned — dead attribute |

 #### Methods

@@ -28,17 +27,12 @@ Central API client that orchestrates authentication, encrypted resource download
 | `set_credentials`            | cdef       | `(self, Credentials credentials)`                                 | Internal: set credentials, lazy-init CDN manager             |
 | `login`                      | cdef       | `(self)`                                                          | POST `/login`, store JWT token                               |
 | `set_token`                  | cdef       | `(self, str token)`                                               | Decode JWT claims → create `User` with role mapping          |
-| `get_user`                   | cdef       | `(self) -> User`                                                  | Lazy login + return user                                     |
 | `request`                    | cdef       | `(self, str method, str url, object payload, bint is_stream)`     | Authenticated HTTP request with auto-retry on 401/403        |
-| `list_files`                 | cdef       | `(self, str folder, str search_file)`                             | GET `/resources/list/{folder}` with search param             |
-| `check_resource`             | cdef       | `(self)`                                                          | POST `/resources/check` with hardware fingerprint            |
 | `load_bytes`                 | cdef       | `(self, str filename, str folder) -> bytes`                       | Download + decrypt resource using per-user+hw key            |
-| `upload_file`                | cdef       | `(self, str filename, bytes resource, str folder)`                | POST multipart upload to `/resources/{folder}`               |
+| `upload_file`                | cdef       | `(self, str filename, bytes resource, str folder)`                | POST multipart upload to `/resources/{folder}`; raises on HTTP error |
 | `load_big_file_cdn`          | cdef       | `(self, str folder, str big_part) -> bytes`                       | Download large file part from CDN                            |
 | `load_big_small_resource`    | cpdef      | `(self, str resource_name, str folder) -> bytes`                  | Reassemble resource from small (API) + big (CDN/local) parts |
-| `upload_big_small_resource`  | cpdef      | `(self, bytes resource, str resource_name, str folder)`           | Split-encrypt and upload small part to API, big part to CDN  |
-| `upload_to_cdn`              | cpdef      | `(self, str bucket, str filename, bytes file_bytes)`              | Direct CDN upload                                            |
-| `download_from_cdn`          | cpdef      | `(self, str bucket, str filename) -> bytes`                       | Direct CDN download                                          |
+| `upload_big_small_resource`  | cpdef      | `(self, bytes resource, str resource_name, str folder)`           | Split-encrypt; upload big part to CDN (raises on failure), then upload small part via `upload_file` |

 ## Internal Logic

@@ -58,8 +52,8 @@ Central API client that orchestrates authentication, encrypted resource download
 ### Big/Small Resource Split (upload)
 1. Encrypts entire resource with shared resource key
 2. Splits: small part = `min(SMALL_SIZE_KB * 1024, 30% of encrypted)`, big part = remainder
-3. Uploads big part to CDN + saves local copy
-4. Uploads small part to API via multipart POST
+3. Calls `cdn_manager.upload` for the big part; raises if upload fails
+4. Writes big part to local cache, then uploads small part to API via `upload_file` (a non-2xx response raises and propagates to the caller)
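
The size rule in step 2 can be sketched as below. This is a minimal illustration, not the project's code: `split_encrypted` is a hypothetical helper, and it assumes the small part is the leading slice of the encrypted bytes and that the 30% figure is rounded down.

```python
SMALL_SIZE_KB = 3  # value listed in the constants table


def split_encrypted(encrypted: bytes, small_size_kb: int = SMALL_SIZE_KB) -> tuple[bytes, bytes]:
    """Return (small_part, big_part) for already-encrypted bytes."""
    # The small part is capped at SMALL_SIZE_KB KiB or 30% of the payload, whichever is smaller.
    # Assumption: 30% is rounded down and the small part is the leading slice.
    small_len = min(small_size_kb * 1024, len(encrypted) * 3 // 10)
    return encrypted[:small_len], encrypted[small_len:]


small, big = split_encrypted(bytes(10_000))
print(len(small), len(big))  # 3000 7000
```

For payloads above roughly 10 KiB the 3 KiB cap applies, so the small part stays constant while the big part grows.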

 ### JWT Role Mapping
 Maps `role` claim string to `RoleEnum`: ApiAdmin, Admin, ResourceUploader, Validator, Operator, or NONE (default).
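
A rough sketch of the role mapping follows. The member names come from the line above; the numeric values, the helper name `role_from_token`, and the choice to decode the payload without verifying the signature are assumptions made for illustration, since the text does not say how `set_token` decodes the claims.

```python
import base64
import json
from enum import Enum


class RoleEnum(Enum):
    # Member names are from the documentation; the numeric values are placeholders.
    NONE = 0
    Operator = 1
    Validator = 2
    ResourceUploader = 3
    Admin = 4
    ApiAdmin = 5


def role_from_token(token: str) -> RoleEnum:
    """Read the `role` claim from a JWT payload (no signature verification)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    # Unknown or missing roles fall back to NONE, matching the documented default.
    return RoleEnum.__members__.get(str(claims.get("role", "")), RoleEnum.NONE)
```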
@@ -89,7 +83,7 @@ The CDN config file is itself downloaded encrypted from the API on first credent

 ## External Integrations

- **Azaion Resource API**: `/login`, `/resources/get/{folder}`, `/resources/{folder}` (upload), `/resources/list/{folder}`, `/resources/check`
+- **Azaion Resource API**: `/login`, `/resources/get/{folder}`, `/resources/{folder}` (upload)
 - **S3 CDN**: via `CDNManager` for large file parts

 ## Security
@@ -11,7 +11,7 @@ Handles the encrypted Docker image archive workflow: downloading a key fragment
 | Function               | Signature                                                              | Description                                              |
 |------------------------|------------------------------------------------------------------------|----------------------------------------------------------|
 | `download_key_fragment`| `(resource_api_url: str, token: str) -> bytes`                         | GET request to `/binary-split/key-fragment` with Bearer auth |
-| `decrypt_archive`      | `(encrypted_path: str, key_fragment: bytes, output_path: str) -> None` | AES-256-CBC decryption with SHA-256 derived key; strips PKCS7 padding |
+| `decrypt_archive`      | `(encrypted_path: str, key_fragment: bytes, output_path: str) -> None` | AES-256-CBC stream decrypt with SHA-256 derived key; PKCS7 removed in-pipeline via unpadder |
 | `docker_load`          | `(tar_path: str) -> None`                                             | Runs `docker load -i <tar_path>` subprocess              |
 | `check_images_loaded`  | `(version: str) -> bool`                                              | Checks all `API_SERVICES` images exist for given version tag |

@@ -26,9 +26,8 @@ Handles the encrypted Docker image archive workflow: downloading a key fragment
 ### `decrypt_archive`
 1. Derives AES key: `SHA-256(key_fragment)` → 32-byte key
 2. Reads first 16 bytes as IV from encrypted file
-3. Decrypts remaining data in 64KB chunks using AES-256-CBC
-4. After decryption, reads last byte of output to determine PKCS7 padding length
-5. Truncates output file to remove padding
+3. Streams ciphertext in 64KB chunks through AES-256-CBC decryptor
+4. Feeds decrypted chunks through `padding.PKCS7(128).unpadder()`; writes unpadded bytes to the output file (`finalize` on decryptor and unpadder at end)
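
The streaming pipeline described in these steps might look roughly like this. It is a sketch under assumptions: `decrypt_stream` is a hypothetical name that works on open binary file objects rather than paths, and it requires the `cryptography` package.

```python
import hashlib
from typing import BinaryIO

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

CHUNK_SIZE = 64 * 1024


def decrypt_stream(src: BinaryIO, dst: BinaryIO, key_fragment: bytes) -> None:
    """Decrypt IV-prefixed AES-256-CBC data from src into dst, removing PKCS7 padding."""
    key = hashlib.sha256(key_fragment).digest()  # 32-byte AES key
    iv = src.read(16)  # the first 16 bytes of the input are the IV
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    unpadder = padding.PKCS7(128).unpadder()

    # Each ciphertext chunk goes through the decryptor, then the unpadder.
    while chunk := src.read(CHUNK_SIZE):
        dst.write(unpadder.update(decryptor.update(chunk)))
    # Flush both stages; the unpadder releases the final block minus its padding.
    dst.write(unpadder.update(decryptor.finalize()))
    dst.write(unpadder.finalize())
```

Because the unpadder holds back the last block until `finalize`, the output never contains padding bytes, so no seek-and-truncate pass over the output file is needed afterwards.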

 ### `check_images_loaded`
 Iterates all 7 service image names, runs `docker image inspect <name>:<version>` for each. Returns `False` on first missing image.
@@ -36,7 +35,7 @@ Iterates all 7 service image names, runs `docker image inspect <name>:<version>`
 ## Dependencies

 - **Internal**: none (leaf module)
- **External**: `hashlib`, `os`, `subprocess` (stdlib), `requests` (2.32.4), `cryptography` (44.0.2)
+- **External**: `hashlib`, `subprocess` (stdlib), `requests` (2.32.4), `cryptography` (44.0.2)

 ## Consumers

@@ -8,15 +8,10 @@ Centralizes shared configuration constants and provides the application-wide log

 ### Constants (cdef, module-level)

-| Name                   | Type | Value                          |
-|------------------------|------|--------------------------------|
-| CONFIG_FILE            | str  | `"config.yaml"`                |
-| QUEUE_CONFIG_FILENAME  | str  | `"secured-config.json"`        |
-| AI_ONNX_MODEL_FILE    | str  | `"azaion.onnx"`                |
-| CDN_CONFIG             | str  | `"cdn.yaml"`                   |
-| MODELS_FOLDER          | str  | `"models"`                     |
-| SMALL_SIZE_KB          | int  | `3`                            |
-| ALIGNMENT_WIDTH        | int  | `32`                           |
+| Name          | Type | Value        |
+|---------------|------|--------------|
+| CDN_CONFIG    | str  | `"cdn.yaml"` |
+| SMALL_SIZE_KB | int  | `3`          |

 Note: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, `ANNOTATIONS_QUEUE` are declared in the `.pxd` but not defined in the `.pyx` — they are unused in this codebase.

@@ -30,7 +25,7 @@ Note: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, `ANNOTATIONS_QUEUE` are declared in the
 ## Internal Logic

 Loguru is configured with three sinks:
- **File sink**: `Logs/log_loader_{date}.txt`, INFO level, daily rotation, 30-day retention, async (enqueue=True)
+- **File sink**: under `LOG_DIR`, path template `log_loader_{time:YYYYMMDD}.txt`, INFO level, daily rotation, 30-day retention, async (enqueue=True)
 - **Stdout sink**: DEBUG level, filtered to INFO/DEBUG/SUCCESS only, colorized
 - **Stderr sink**: WARNING+ level, colorized

@@ -39,7 +34,7 @@ Log format: `[HH:mm:ss LEVEL] message`
 ## Dependencies

 - **Internal**: none (leaf module)
- **External**: `loguru` (0.7.3), `sys`, `time`
+- **External**: `loguru` (0.7.3), `os`, `sys`

 ## Consumers

@@ -53,7 +48,11 @@ None.

 ## Configuration

-No env vars consumed directly. Log file path is hardcoded to `Logs/log_loader_{date}.txt`.
+| Env Variable | Default | Description                          |
+|--------------|---------|--------------------------------------|
+| LOG_DIR      | `Logs`  | Directory for daily log files        |
+
+The file sink uses Loguru’s `{time:YYYYMMDD}` in the filename under `LOG_DIR`.
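
As an illustration of the configuration above, the sink path could be assembled as follows. `log_file_template` is a hypothetical helper, and the `logger.add` arguments shown in the comment are one plausible way to get the documented behaviour, not a copy of the module's actual call.

```python
import os
from typing import Mapping


def log_file_template(env: Mapping[str, str] = os.environ) -> str:
    """Build the file-sink path template, honouring LOG_DIR (default: Logs)."""
    log_dir = env.get("LOG_DIR", "Logs")
    # The {time:YYYYMMDD} token is left untouched; Loguru expands it when it opens the file.
    return os.path.join(log_dir, "log_loader_{time:YYYYMMDD}.txt")


# With Loguru installed, the template would be handed to the file sink, for example:
# logger.add(log_file_template(), level="INFO", rotation="00:00",
#            retention="30 days", enqueue=True)
```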

 ## External Integrations

@@ -33,29 +33,36 @@ FastAPI application entry point providing HTTP endpoints for health checks, auth

 ### Module-level State

-| Name           | Type               | Description                                    |
-|----------------|--------------------|------------------------------------------------|
-| api_client     | ApiClient or None  | Lazy-initialized singleton                     |
-| unlock_state   | UnlockState        | Current unlock workflow state                  |
-| unlock_error   | Optional[str]      | Last unlock error message                      |
-| unlock_lock    | threading.Lock     | Thread safety for unlock state mutations       |
+| Name              | Type                    | Description                                                    |
+|-------------------|-------------------------|----------------------------------------------------------------|
+| `_api_client`     | `ApiClient` or `None`   | Lazy-initialized singleton                                     |
+| `_api_client_lock`| `threading.Lock`        | Protects lazy initialization of `_api_client` (double-checked) |
+| `_unlock`         | `_UnlockStateHolder`    | Holds unlock workflow state and last error under an inner lock |
+
+#### `_UnlockStateHolder`
+
+| Member                   | Description                                                     |
+|--------------------------|-----------------------------------------------------------------|
+| `get()`                  | Returns `(state: UnlockState, error: Optional[str])` under lock |
+| `set(state, error=None)` | Sets state and optional error message under lock                |
+| `state` (property)       | Current `UnlockState` (read under lock)                         |

 ## Internal Logic

 ### `get_api_client()`
-Lazy singleton pattern: creates `ApiClient(RESOURCE_API_URL)` on first call.
+Double-checked locking: if `_api_client` is `None`, acquires `_api_client_lock`, re-checks, then imports `ApiClient` and constructs `ApiClient(RESOURCE_API_URL)` once.
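
The double-checked pattern can be sketched like this. The `ApiClient` class below is a stand-in for the real Cython class (which the module imports lazily inside the lock), and the `RESOURCE_API_URL` value is a placeholder.

```python
import threading
from typing import Optional

RESOURCE_API_URL = "https://api.example.invalid"  # placeholder value


class ApiClient:
    """Stand-in for the real ApiClient; counts how often it is constructed."""
    instances = 0

    def __init__(self, api_url: str) -> None:
        type(self).instances += 1
        self.api_url = api_url


_api_client: Optional[ApiClient] = None
_api_client_lock = threading.Lock()


def get_api_client() -> ApiClient:
    global _api_client
    if _api_client is None:          # fast path: no locking once initialized
        with _api_client_lock:
            if _api_client is None:  # re-check: another thread may have won the race
                _api_client = ApiClient(RESOURCE_API_URL)
    return _api_client
```

The second check is what prevents two threads that both saw `None` from each constructing a client.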

 ### Unlock Workflow (`_run_unlock`)
 Background task (via FastAPI BackgroundTasks) that runs these steps:
-1. Check if Docker images already loaded → if yes, set `ready`
+1. Check if Docker images already loaded → if yes, set `ready` (preserving any prior error from `get()`)
 2. Authenticate with API (login)
 3. Download key fragment from `/binary-split/key-fragment`
 4. Decrypt archive at `IMAGES_PATH` → `.tar`
 5. `docker load` the tar file
-6. Clean up tar file
-7. Set state to `ready` (or `error` on failure)
+6. Remove tar file; on `OSError`, log a warning and continue
+7. Set state to `ready` with no error (or `error` on failure)

-State transitions are guarded by `unlock_lock` (threading.Lock).
+State and error are updated only through `_unlock.set()` and read via `_unlock.get()` / `_unlock.state`.
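
The tar cleanup in step 6 might be handled along these lines. `remove_tar` is a hypothetical helper, and the standard `logging` module stands in for the Loguru logger the module actually uses.

```python
import logging
import os

logger = logging.getLogger("loader")  # stand-in for the Loguru logger


def remove_tar(tar_path: str) -> bool:
    """Delete the decrypted tar; report failure without aborting the unlock."""
    try:
        os.remove(tar_path)
        return True
    except OSError as exc:
        # A leftover tar is not fatal: the images have already been loaded by this point.
        logger.warning("Could not remove %s: %s", tar_path, exc)
        return False
```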

 ### `/unlock` Endpoint
 - If already `ready` → return immediately
@@ -65,8 +72,8 @@ State transitions are guarded by `unlock_lock` (threading.Lock).

 ## Dependencies

- **Internal**: `unlock_state` (UnlockState enum), `api_client` (lazy import), `binary_split` (lazy import)
- **External**: `os`, `threading` (stdlib), `fastapi`, `pydantic`
+- **Internal**: `UnlockState` from `unlock_state`, `get_api_client()` (lazy `api_client` import), `binary_split` (lazy import in unlock paths)
+- **External**: `os`, `threading` (stdlib), `fastapi`, `pydantic`, `loguru` (logger for tar cleanup warnings)

 ## Consumers

@@ -94,7 +101,7 @@ None — this is the entry point module.

 - Login endpoint returns 401 on auth failure
 - All resource endpoints use authenticated API client
- Unlock state is thread-safe via `threading.Lock`
+- Unlock state and error are guarded by `_UnlockStateHolder`’s lock; API client initialization is guarded by `_api_client_lock`
 - Lazy imports of Cython modules (`api_client`, `binary_split`) to avoid import-time side effects

 ## Tests
@@ -15,7 +15,7 @@ All methods are `@staticmethod cdef` — Cython-only visibility, not callable fr
 | Method                      | Signature                                                       | Description                                                          |
 |-----------------------------|-----------------------------------------------------------------|----------------------------------------------------------------------|
 | `encrypt_to`                | `(input_bytes, key) -> bytes`                                   | AES-256-CBC encrypt with random IV, PKCS7 padding; returns `IV + ciphertext` |
-| `decrypt_to`                | `(ciphertext_with_iv_bytes, key) -> bytes`                      | AES-256-CBC decrypt; first 16 bytes = IV; manual PKCS7 unpad        |
+| `decrypt_to`                | `(ciphertext_with_iv_bytes, key) -> bytes`                      | AES-256-CBC decrypt; first 16 bytes = IV; PKCS7 via `padding.PKCS7(128).unpadder()` |
 | `get_hw_hash`               | `(str hardware) -> str`                                         | Derives hardware hash: `SHA-384("Azaion_{hardware}_%$$$)0_")` → base64 |
 | `get_api_encryption_key`    | `(Credentials creds, str hardware_hash) -> str`                 | Derives per-user+hw key: `SHA-384("{email}-{password}-{hw_hash}-#%@AzaionKey@%#---")` → base64 |
 | `get_resource_encryption_key`| `() -> str`                                                    | Returns fixed shared key: `SHA-384("-#%@AzaionKey@%#---234sdfklgvhjbnn")` → base64 |
@@ -40,7 +40,7 @@ All methods are `@staticmethod cdef` — Cython-only visibility, not callable fr
 1. SHA-256 hash of string key → 32-byte AES key
 2. Split input: first 16 bytes = IV, rest = ciphertext
 3. AES-CBC decrypt
-4. Manual PKCS7 unpadding: read last byte as padding length; strip if 1–16
+4. PKCS7 removal via `cryptography` `padding.PKCS7(128).unpadder()` (`update` + `finalize`)
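
For reference, an in-memory round trip matching the `encrypt_to` and `decrypt_to` descriptions could look like this in plain Python. It is a sketch, not the Cython implementation: it assumes the string key is hashed as UTF-8 bytes and requires the `cryptography` package.

```python
import hashlib
import os

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def _aes_key(key: str) -> bytes:
    # Assumption: the string key is hashed as UTF-8 bytes.
    return hashlib.sha256(key.encode("utf-8")).digest()


def encrypt_to(input_bytes: bytes, key: str) -> bytes:
    iv = os.urandom(16)  # fresh random IV per message
    padder = padding.PKCS7(128).padder()
    padded = padder.update(input_bytes) + padder.finalize()
    encryptor = Cipher(algorithms.AES(_aes_key(key)), modes.CBC(iv)).encryptor()
    return iv + encryptor.update(padded) + encryptor.finalize()


def decrypt_to(ciphertext_with_iv: bytes, key: str) -> bytes:
    iv, ciphertext = ciphertext_with_iv[:16], ciphertext_with_iv[16:]
    decryptor = Cipher(algorithms.AES(_aes_key(key)), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    # finalize() raises ValueError on malformed padding rather than returning bad bytes.
    return unpadder.update(padded) + unpadder.finalize()
```

Compared with the removed manual check, which stripped padding only when the last byte was between 1 and 16, the library unpadder rejects malformed padding outright.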

 ### Key Derivation Hierarchy
 - **Hardware hash**: salted hardware fingerprint → SHA-384 → base64
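
The hardware-hash derivation from the `get_hw_hash` row can be reproduced in plain Python as a sketch. The salt string is taken from that row; hashing the string as UTF-8 bytes and using standard (not URL-safe) base64 are assumptions.

```python
import base64
import hashlib


def get_hw_hash(hardware: str) -> str:
    """SHA-384 of the salted hardware string, base64-encoded."""
    salted = f"Azaion_{hardware}_%$$$)0_"
    # Assumption: the salted string is hashed as UTF-8 bytes.
    digest = hashlib.sha384(salted.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")
```

A SHA-384 digest is 48 bytes, so the base64 result is always 64 characters with no `=` padding.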