Mirror of https://github.com/azaion/loader.git, synced 2026-04-22 09:46:32 +00:00
Quality cleanup refactoring
Made-with: Cursor
@@ -0,0 +1,77 @@
# Fix Crypto Padding and Upload Error Handling

**Task**: 06_refactor_crypto_uploads
**Name**: Fix crypto padding and upload error propagation
**Description**: Replace manual PKCS7 unpadding with library implementation and propagate upload failures instead of swallowing them
**Complexity**: 3 points
**Dependencies**: None
**Component**: Security, Resource Management
**Tracker**: PENDING
**Epic**: PENDING (01-quality-cleanup)

## Problem

The decryption path uses manual PKCS7 padding removal that only checks the last byte instead of validating all padding bytes. Corrupted or tampered ciphertext silently produces garbage output. Additionally, resource upload failures (both CDN and API) are silently swallowed: the caller reports success when the upload actually failed.
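
To illustrate the difference, here is a minimal pure-Python sketch. Both function names are hypothetical, and `unpad_last_byte_only` is only an assumed approximation of the manual approach described above, not the project's actual code. The real fix would use the cryptography library's unpadder rather than a hand-written check.

```python
BLOCK_SIZE = 16  # AES block size in bytes, i.e. PKCS7(128)


def unpad_last_byte_only(data: bytes) -> bytes:
    """Hypothetical manual approach: trust the last byte and truncate."""
    return data[: -data[-1]]


def unpad_strict(data: bytes) -> bytes:
    """Validate every padding byte before removing the padding."""
    if not data or len(data) % BLOCK_SIZE != 0:
        raise ValueError("Invalid padding: length is not a multiple of the block size")
    pad_len = data[-1]
    if pad_len < 1 or pad_len > BLOCK_SIZE:
        raise ValueError("Invalid padding: bad length byte")
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("Invalid padding: padding bytes do not match")
    return data[:-pad_len]


# 12 bytes of content followed by 4 valid padding bytes
good = b"hello world!" + bytes([4]) * 4
# Same block with a corrupted final byte (7 instead of 4)
corrupt = b"hello world!" + bytes([4, 4, 4, 7])
```

With these inputs the last-byte-only version silently returns truncated content for `corrupt`, while the strict version raises `ValueError`.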

## Outcome

- Decryption raises on invalid padding instead of returning garbage
- Upload failures propagate to the HTTP endpoint and return appropriate error responses
- The encrypt→decrypt round trip uses the same library for both directions

## Scope

### Included
- Replace manual unpadding in Security.decrypt_to with library PKCS7 unpadder
- Replace manual padding removal in binary_split.decrypt_archive with library unpadder
- Check cdn_manager.upload return value in upload_big_small_resource
- Let upload_file exceptions propagate instead of catching and logging

### Excluded
- Changing encryption (encrypt_to), which already uses the library correctly
- Modifying CDNManager.upload/download internals
- Changing the binary-split scheme itself

## Acceptance Criteria

**AC-1: Library unpadder in Security.decrypt_to**
Given an encrypted resource produced by encrypt_to
When decrypt_to is called
Then it uses padding.PKCS7(128).unpadder() instead of manual byte inspection

**AC-2: Library unpadder in decrypt_archive**
Given an encrypted Docker image archive
When decrypt_archive is called
Then padding is removed using the cryptography library, not manual file truncation

**AC-3: CDN upload failure raises**
Given cdn_manager.upload returns False
When upload_big_small_resource is called
Then an exception is raised before the method returns

**AC-4: API upload failure propagates**
Given the Resource API is unreachable during upload
When upload_file is called
Then the exception propagates to the caller
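
The behavior required by AC-3 and AC-4 could look roughly like the sketch below. `UploadError`, `FakeCdnManager`, `upload_big_part`, and the `send` callable are hypothetical names used for illustration. The only detail taken from the task is that `cdn_manager.upload` signals failure by returning False.

```python
class UploadError(RuntimeError):
    """Hypothetical exception type for a failed CDN upload."""


class FakeCdnManager:
    """Stand-in for cdn_manager; upload() reports success as a bool."""

    def __init__(self, succeed: bool):
        self.succeed = succeed

    def upload(self, name: str, data: bytes) -> bool:
        return self.succeed


def upload_big_part(cdn_manager, name: str, data: bytes) -> None:
    # Check the return value instead of ignoring it (AC-3).
    if not cdn_manager.upload(name, data):
        raise UploadError(f"CDN upload failed for {name}")


def upload_file(send, name: str, data: bytes) -> None:
    # No try/except here: any error raised by `send` reaches the caller (AC-4).
    send(name, data)


def unreachable_api(name: str, data: bytes) -> None:
    """Simulates the Resource API being unreachable."""
    raise ConnectionError("Resource API unreachable")
```

With this shape the HTTP layer decides how to turn the exception into an error response, so the upload helpers never report success by default.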

**AC-5: Round trip still works**
Given a resource is uploaded via upload_big_small_resource
When it is downloaded via load_big_small_resource
Then the original content is returned unchanged

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-5 | Docker services running | Upload then download a resource | Content matches original | None |

## Constraints

- security.pyx is Cython, so changes must be valid Cython syntax
- binary_split.py uses streaming file I/O, so unpadding must work with the existing chunk-based approach
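
For the streaming constraint, one possible approach is to hold back the final block until the stream ends and validate the padding only then. The pure-Python sketch below assumes 16-byte blocks and a hypothetical function name. In the real code the cryptography library's unpadder, with its `update()` and `finalize()` methods, would take the place of the hand-written check.

```python
from typing import Iterable, Iterator

BLOCK_SIZE = 16  # AES block size in bytes


def strip_padding_streaming(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield plaintext from decrypted chunks, validating PKCS7 padding at the end."""
    tail = b""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        buf = tail + chunk
        if len(buf) > BLOCK_SIZE:
            # Everything except the last block is certainly not padding.
            yield buf[:-BLOCK_SIZE]
            tail = buf[-BLOCK_SIZE:]
        else:
            tail = buf
    if total == 0 or total % BLOCK_SIZE != 0:
        raise ValueError("Invalid padding: length is not a multiple of the block size")
    pad_len = tail[-1]
    if not 1 <= pad_len <= BLOCK_SIZE or tail[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("Invalid padding bytes")
    if pad_len < BLOCK_SIZE:
        yield tail[:-pad_len]
```

Because only one block is buffered, memory use stays constant regardless of archive size, and the chunk boundaries do not need to align with block boundaries.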

## Risks & Mitigation

**Risk 1: Existing encrypted data with non-standard padding**
- *Risk*: If any previously encrypted data has irregular padding bytes, the library unpadder will raise ValueError
- *Mitigation*: The e2e test_upload_download_roundtrip validates the full encrypt→decrypt path; all existing data was produced by encrypt_to, which uses the same library padder
@@ -0,0 +1,66 @@
# Thread Safety in Main Module

**Task**: 07_refactor_thread_safety
**Name**: Thread-safe singleton and encapsulated unlock state
**Description**: Add thread-safe initialization for the ApiClient singleton and encapsulate unlock state management
**Complexity**: 3 points
**Dependencies**: None
**Component**: HTTP API
**Tracker**: PENDING
**Epic**: PENDING (01-quality-cleanup)

## Problem

The ApiClient singleton in main.py is initialized without a lock, so concurrent requests can create duplicate instances. The unlock workflow state is managed through module-level globals, scattering state transitions across multiple functions.

## Outcome

- ApiClient singleton initialization is thread-safe under concurrent HTTP requests
- Unlock state is encapsulated in a dedicated holder with thread-safe accessors
- No change to external behavior or API responses

## Scope

### Included
- Add threading.Lock for ApiClient singleton initialization (double-checked locking)
- Create a state holder class for unlock_state and unlock_error with lock-guarded methods
- Update all unlock state reads/writes to use the holder

### Excluded
- Changing the ApiClient class itself (api_client.pyx)
- Modifying the unlock workflow logic or state machine transitions
- Adding new endpoints or changing API contracts

## Acceptance Criteria

**AC-1: Thread-safe singleton**
Given multiple concurrent requests hitting any endpoint
When get_api_client() is called simultaneously
Then exactly one ApiClient instance is created

**AC-2: Encapsulated unlock state**
Given the unlock workflow is in progress
When unlock/status is queried
Then state is read through a thread-safe accessor, not via bare globals
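
A state holder satisfying AC-2 might be shaped like the following sketch. The class name, method names, and the state strings ("idle", "error") are assumptions for illustration. The task only specifies that unlock_state and unlock_error sit behind lock-guarded methods.

```python
import threading
from typing import Optional, Tuple


class UnlockStateHolder:
    """Hypothetical holder that keeps unlock state and error consistent."""

    def __init__(self, initial_state: str = "idle"):
        self._lock = threading.Lock()
        self._state = initial_state
        self._error: Optional[str] = None

    def set(self, state: str, error: Optional[str] = None) -> None:
        # Update both fields under one lock so readers never see a mix.
        with self._lock:
            self._state = state
            self._error = error

    def get(self) -> Tuple[str, Optional[str]]:
        # Return a consistent snapshot of (state, error).
        with self._lock:
            return self._state, self._error


holder = UnlockStateHolder()
holder.set("error", "decryption failed")
state, error = holder.get()
```

Returning both values from a single `get()` call avoids the case where a status request reads a new state paired with a stale error.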

**AC-3: Existing behavior preserved**
Given the current e2e test suite
When all tests are run
Then all 18 tests pass with no regressions

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-3 | Docker services running | Full e2e suite | 18 passed, 0 failed | None |

## Constraints

- main.py is pure Python, so no Cython syntax constraints apply
- Must preserve FastAPI's BackgroundTasks compatibility for the unlock flow

## Risks & Mitigation

**Risk 1: Lock contention on high-concurrency paths**
- *Risk*: Adding a lock to get_api_client could slow concurrent requests
- *Mitigation*: With double-checked locking, the lock is acquired only while the instance is still uninitialized; once it exists, every call returns through the fast path without locking
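
The double-checked locking pattern referred to above can be sketched as follows. The `ApiClient` class here is a placeholder that counts constructions, and the module-level names `_api_client` and `_api_client_lock` are assumptions. Only `get_api_client` is named in the task.

```python
import threading


class ApiClient:
    """Placeholder for the real ApiClient; counts how often it is constructed."""

    instances_created = 0

    def __init__(self):
        ApiClient.instances_created += 1


_api_client = None
_api_client_lock = threading.Lock()


def get_api_client() -> ApiClient:
    global _api_client
    # Fast path: no lock once the instance exists.
    if _api_client is None:
        with _api_client_lock:
            # Re-check under the lock: another thread may have won the race.
            if _api_client is None:
                _api_client = ApiClient()
    return _api_client
```

The second check is what prevents duplicates: threads that pass the first check at the same time are serialized by the lock, and all but the first find the instance already set.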
@@ -0,0 +1,75 @@
# Dead Code Removal and Minor Fixes

**Task**: 08_refactor_cleanup
**Name**: Remove dead code, fix log path and error handling
**Description**: Remove orphan methods and constants, make log path configurable, log os.remove failure
**Complexity**: 2 points
**Dependencies**: 06_refactor_crypto_uploads
**Component**: Resource Management, Core Models, HTTP API
**Tracker**: PENDING
**Epic**: PENDING (01-quality-cleanup)

## Problem

The codebase contains 5 never-called methods in ApiClient and 8 orphan constant declarations. The log file path is hardcoded with no environment override. A file removal error is silently swallowed.

## Outcome

- Dead methods and constants are removed from source and declaration files
- Log file directory is configurable via environment variable
- File removal failure is logged instead of silently ignored
- Codebase is smaller and cleaner with no behavioral regressions

## Scope

### Included
- Delete 5 orphan methods from api_client.pyx: get_user, list_files, check_resource, upload_to_cdn, download_from_cdn
- Delete corresponding declarations from api_client.pxd
- Delete 5 unused constants from constants.pyx: CONFIG_FILE, QUEUE_CONFIG_FILENAME, AI_ONNX_MODEL_FILE, MODELS_FOLDER, ALIGNMENT_WIDTH
- Delete 8 orphan declarations from constants.pxd (keep CDN_CONFIG, SMALL_SIZE_KB, log, logerror)
- Make log directory configurable via LOG_DIR env var in constants.pyx
- Replace bare except: pass with warning log in main.py _run_unlock

### Excluded
- Modifying any live code paths or method signatures
- Changing the logging format or levels
- Removing hardware_service.pyx silent catches (those are by-design for cross-platform compatibility)

## Acceptance Criteria

**AC-1: Dead methods removed**
Given the source code
When searching for get_user, list_files, check_resource, upload_to_cdn, download_from_cdn
Then no definitions or declarations exist in api_client.pyx or api_client.pxd

**AC-2: Dead constants removed**
Given constants.pyx and constants.pxd
When the files are inspected
Then only CDN_CONFIG, SMALL_SIZE_KB, log, logerror declarations remain in the pxd

**AC-3: Configurable log path**
Given LOG_DIR environment variable is set
When the application starts
Then logs are written to the specified directory
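
A minimal sketch of the LOG_DIR lookup for AC-3 is shown below. The fallback value `"Logs"` and the helper name `resolve_log_dir` are assumptions, since the task does not state the current hardcoded path. Only the `LOG_DIR` variable name comes from the scope.

```python
import os
from typing import Mapping

DEFAULT_LOG_DIR = "Logs"  # assumed fallback; the real default is not stated in the task


def resolve_log_dir(environ: Mapping[str, str] = os.environ) -> str:
    """Return LOG_DIR if set and non-empty, otherwise the default directory."""
    return environ.get("LOG_DIR") or DEFAULT_LOG_DIR
```

Passing the environment as a parameter keeps the lookup easy to test without mutating the process environment.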

**AC-4: Error logged on tar removal failure**
Given os.remove fails on the tar file during unlock
When the failure occurs
Then a warning-level log message is emitted
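
One way to satisfy AC-4 is to catch only `OSError` and log it at warning level, as in this sketch. It uses the standard `logging` module with a hypothetical logger name and helper name. The project may instead route this through its own log helpers in constants.

```python
import logging
import os

logger = logging.getLogger("loader")  # hypothetical logger name


def remove_tar(path: str) -> bool:
    """Remove the tar file; log a warning instead of silently ignoring failure."""
    try:
        os.remove(path)
        return True
    except OSError as exc:
        # Narrower than a bare except: only filesystem errors are handled here.
        logger.warning("Failed to remove %s: %s", path, exc)
        return False
```

Catching `OSError` rather than using a bare `except` keeps unrelated errors, such as `KeyboardInterrupt`, from being swallowed.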

**AC-5: No regressions**
Given the current e2e test suite
When all tests are run
Then all 18 tests pass

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-5 | Docker services running | Full e2e suite | 18 passed, 0 failed | None |

## Risks & Mitigation

**Risk 1: Removing a method that's called via dynamic dispatch**
- *Risk*: A method could be invoked dynamically (getattr, etc.) rather than statically
- *Mitigation*: All removed methods are cdef/cpdef: cdef methods cannot be called dynamically from Python, and cpdef methods were grep-verified to have zero callers