- Updated the detection image endpoint to require a channel ID for event streaming.
- Introduced a new endpoint for streaming detection events, allowing clients to receive real-time updates.
- Enhanced the internal buffering mechanism for detection events to manage multiple channels.
- Refactored the inference module to support the new event handling structure.
Made-with: Cursor
- Pin all deps; h11==0.16.0 (CVE-2025-43859), python-multipart>=1.3.1 (CVE-2026-28356), PyJWT==2.12.1
- Add HMAC JWT verification (require_auth FastAPI dependency, JWT_SECRET-gated)
- Fix TokenManager._refresh() to use ADMIN_API_URL instead of ANNOTATIONS_URL
- Rename POST /detect → POST /detect/image (image-only, rejects video files)
- Replace global SSE stream with per-job SSE: GET /detect/{media_id} with event replay buffer
- Apply require_auth to all 4 protected endpoints
- Fix on_annotation/on_status closure to use mutable current_id for correct post-upload event routing
- Add non-root appuser to Dockerfile and Dockerfile.gpu
- Add JWT_SECRET to e2e/docker-compose.test.yml and run-tests.sh
- Update all e2e tests and unit tests for new endpoints and HMAC token signing
- 64/64 tests pass
Made-with: Cursor
- Updated various Cython files to explicitly cast types, enhancing type safety and readability.
- Adjusted the `engine_name` property in `InferenceEngine` and its subclasses to be set directly in the constructor.
- Modified the `request` method in `_SessionWithBase` to accept `*args` for better flexibility.
- Ensured proper type casting for return values in methods across multiple classes, including `Inference`, `CoreMLEngine`, and `TensorRTEngine`.
These changes aim to streamline the codebase and improve maintainability by enforcing consistent type usage.
- Modified the health endpoint to return "None" for AI availability when inference is not initialized, improving clarity on system status.
- Enhanced the test documentation to include handling of skipped tests, emphasizing the need for investigation before proceeding.
- Updated test assertions to ensure proper execution order and prevent premature engine initialization.
- Refactored test cases to streamline performance testing and improve readability, removing unnecessary complexity.
These changes aim to enhance the robustness of the health check and improve the overall testing framework.
- Updated the `Inference` class to replace the `get_onnx_engine_bytes` method with `download_model`, allowing for dynamic model loading based on a specified filename.
- Modified the `convert_and_upload_model` method to accept `source_bytes` instead of `onnx_engine_bytes`, enhancing flexibility in model conversion.
- Introduced a new property `engine_name` to the `Inference` class for better access to engine details.
- Adjusted the `AIRecognitionConfig` structure to include a new method pointer `from_dict`, improving configuration handling.
- Updated various test cases to reflect changes in model paths and timeout settings, ensuring consistency and reliability in testing.