ai-training/_docs/00_problem/security_approach.md

# Security Approach

## Authentication

- **API Authentication**: JWT-based. The client sends an email and password to `POST /login` and receives a JWT, which it then sends as a Bearer token on subsequent requests.
- **Auto-relogin**: On HTTP 401/403 responses, the client automatically re-authenticates and retries the request.
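
The login and auto-relogin flow above can be sketched roughly as follows. This is a minimal illustration, not the code in `api_client.py`: the `ApiClient` class, the injected `transport` callable, the `token` response field, and the single-retry limit are all assumptions made so the example stays self-contained.

```python
from typing import Any, Callable, Optional, Tuple

# Hypothetical transport: (method, path, headers, json_body) -> (status, body).
# A real client would wrap an HTTP library call here.
Transport = Callable[[str, str, dict, Optional[dict]], Tuple[int, Any]]


class ApiClient:
    """JWT client that logs in on demand and re-authenticates on 401/403."""

    def __init__(self, transport: Transport, email: str, password: str) -> None:
        self._transport = transport
        self._email = email
        self._password = password
        self._token: Optional[str] = None

    def login(self) -> None:
        credentials = {"email": self._email, "password": self._password}
        status, body = self._transport("POST", "/login", {}, credentials)
        if status != 200:
            raise PermissionError(f"login failed with HTTP {status}")
        self._token = body["token"]  # assumed name of the response field

    def request(self, method: str, path: str, json_body: Optional[dict] = None) -> Any:
        if self._token is None:
            self.login()
        status, body = self._send(method, path, json_body)
        if status in (401, 403):
            # Token rejected: re-authenticate and retry once (assumed limit).
            self.login()
            status, body = self._send(method, path, json_body)
        if status >= 400:
            raise RuntimeError(f"{method} {path} failed with HTTP {status}")
        return body

    def _send(self, method: str, path: str, json_body: Optional[dict]) -> Tuple[int, Any]:
        headers = {"Authorization": f"Bearer {self._token}"}
        return self._transport(method, path, headers, json_body)
```

Capping the retry at one attempt keeps a persistently rejected request (for example, a genuine 403 for a forbidden resource) from looping forever.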

## Encryption

- **Model encryption**: AES-256-CBC with a static key defined in `security.py`. All model artifacts (ONNX, TensorRT) are encrypted before upload.
- **Resource encryption**: AES-256-CBC with a hardware-derived key. The key is generated by hashing the machine's CPU model, GPU name, total RAM, and primary drive serial number. This ties decryption to the specific hardware.
- **Implementation**: Uses the `cryptography` library with PKCS7 padding. The IV is prepended to the ciphertext.
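
As a rough sketch of the scheme described above, the following shows a hardware-derived key and an AES-256-CBC round trip with PKCS7 padding and a prepended IV. The function names, the `|` separator, and the choice of SHA-256 as the hash are assumptions for illustration; the actual code in `security.py` may differ.

```python
import hashlib
import os

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

IV_BYTES = 16          # AES block size in bytes, which is also the CBC IV length
AES_BLOCK_BITS = 128   # PKCS7 pads to the AES block size


def derive_hardware_key(cpu_model: str, gpu_name: str, total_ram: str, drive_serial: str) -> bytes:
    """Hash hardware attributes into a 32-byte key (SHA-256 is an assumption)."""
    fingerprint = "|".join([cpu_model, gpu_name, total_ram, drive_serial])
    return hashlib.sha256(fingerprint.encode("utf-8")).digest()


def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Return IV || ciphertext using AES-256-CBC with PKCS7 padding."""
    iv = os.urandom(IV_BYTES)
    padder = padding.PKCS7(AES_BLOCK_BITS).padder()
    padded = padder.update(plaintext) + padder.finalize()
    encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    return iv + encryptor.update(padded) + encryptor.finalize()


def decrypt(blob: bytes, key: bytes) -> bytes:
    """Split off the prepended IV, decrypt, and strip the PKCS7 padding."""
    iv, ciphertext = blob[:IV_BYTES], blob[IV_BYTES:]
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(AES_BLOCK_BITS).unpadder()
    return unpadder.update(padded) + unpadder.finalize()
```

A round trip then looks like `decrypt(encrypt(data, key), key) == data`. Because the key depends on every hardware attribute, changing any one of them (a replaced drive, for example) yields a different key.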

## Model Protection

- **Split storage**: Each encrypted model is split into a small part (≤3 KB or 20% of the total size) stored on the Azaion API server and a big part stored on an S3-compatible CDN. Both parts are required to reconstruct the model.
- **Hardware binding**: Inference clients must run on authorized hardware whose fingerprint matches the encryption key used during upload.
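
The split-storage idea can be illustrated with a short sketch. It assumes the small part is the leading bytes of the encrypted blob and that its size is the smaller of 3 KB and 20% of the total; the text does not state which of the two limits applies, so treat both choices, and the function names, as hypothetical.

```python
from typing import Tuple

SMALL_PART_MAX_BYTES = 3 * 1024  # the 3 KB cap from the text
SMALL_PART_PERCENT = 20          # the 20% figure from the text


def split_encrypted(blob: bytes) -> Tuple[bytes, bytes]:
    """Return (small_part, big_part); the small part is the leading bytes (assumed)."""
    # Assumption: take whichever limit is smaller.
    small_size = min(SMALL_PART_MAX_BYTES, len(blob) * SMALL_PART_PERCENT // 100)
    return blob[:small_size], blob[small_size:]


def join_parts(small_part: bytes, big_part: bytes) -> bytes:
    """Reassemble the encrypted blob; decryption needs the complete result."""
    return small_part + big_part
```

Under these assumptions a 25,600-byte blob yields a 3,072-byte small part, while a 1,000-byte blob yields a 200-byte one.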

## Access Control

- **CDN access**: Separate read-only and write-only S3 credentials. Training uploads use write keys; inference downloads use read keys.
- **Role-based annotation routing**: Validator/Admin annotations go directly to validated storage; Operator annotations go to seed storage pending validation.
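
The annotation routing rule reduces to a small lookup. In this sketch the `Role` enum, the `target_storage` function, and the storage labels `"validated"` and `"seed"` are illustrative names, not identifiers taken from the codebase.

```python
from enum import Enum


class Role(Enum):
    OPERATOR = "operator"
    VALIDATOR = "validator"
    ADMIN = "admin"


# Roles whose annotations skip the validation step.
TRUSTED_ROLES = {Role.VALIDATOR, Role.ADMIN}


def target_storage(role: Role) -> str:
    """Pick the storage an annotation is written to, based on the author's role."""
    return "validated" if role in TRUSTED_ROLES else "seed"
```

Defaulting to seed storage means any role not explicitly trusted goes through validation.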

## Known Security Issues

| Issue | Severity | Location |
|-------|----------|----------|
| Hardcoded API credentials (email, password) | High | `config.yaml` |
| Hardcoded CDN access keys (4 keys) | High | `cdn.yaml` |
| Hardcoded model encryption key | High | `security.py:67` |
| Queue credentials in plaintext | Medium | `config.yaml`, `annotation-queue/config.yaml` |
| No TLS certificate validation | Low | `api_client.py` |
| No input validation on API responses | Low | `api_client.py` |