Deployment Procedures
Deployment Strategy
Pattern: Rolling deployment (single-instance service; blue-green not applicable without load balancer).
Rationale: The Detections service runs as a single instance and keeps no persistent state; its only state is held in memory (_active_detections, _event_queues) and is lost on restart. Replacing the container after a grace-period stop is the simplest approach.
Zero-downtime: Achievable only with a load-balanced multi-instance setup. For single-instance deployments, expect brief downtime (~10-30s) during container restart.
Deployment Flow
1. Pull new images → pull-images.sh
2. Stop current services gracefully → stop-services.sh (30s grace)
3. Start new services → start-services.sh
4. Verify health → health-check.sh
5. If unhealthy → rollback (deploy.sh --rollback)
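The flow above can be sketched as a small driver. This is an illustration only: the script names come from the steps listed, while the `deploy` function, the injectable `run` callable, the relative `./` paths, and the choice to abort without rollback when a step before the health check fails are all assumptions.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type: takes an argv list, returns the exit code.
Runner = Callable[[Sequence[str]], int]


def run_script(argv: Sequence[str]) -> int:
    """Run one deployment script and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def deploy(run: Runner = run_script) -> str:
    """Run steps 1-5 in order; return 'deployed', 'rolled-back', or 'failed'."""
    steps = (
        ["./pull-images.sh"],     # 1. pull new images
        ["./stop-services.sh"],   # 2. stop current services (30s grace)
        ["./start-services.sh"],  # 3. start new services
    )
    for argv in steps:
        if run(argv) != 0:
            # Assumption: a failing step before the health check aborts
            # the deployment and leaves the decision to the operator.
            return "failed"
    if run(["./health-check.sh"]) == 0:  # 4. verify health
        return "deployed"
    run(["./deploy.sh", "--rollback"])   # 5. unhealthy, so roll back
    return "rolled-back"
```

Passing a fake `run` callable makes the ordering testable without touching real containers.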
Health Checks
| Check | Type | Endpoint | Interval | Threshold |
|---|---|---|---|---|
| Liveness | HTTP GET | /health | 30s | 3 failures → restart |
| Readiness | HTTP GET | /health (check aiAvailability != "None") | 10s | Wait for engine init |
Note: The service's /health endpoint returns {"status": "healthy"} immediately, with aiAvailability transitioning through DOWNLOADING → CONVERTING → ENABLED on first inference request (lazy init). Readiness depends on the use case — the API is ready to accept requests immediately, but inference is only ready after engine initialization.
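The distinction between "API is up" and "inference is ready" can be made explicit when interpreting a /health response. The sketch below assumes that aiAvailability is a top-level string field in the JSON body; the function name and the returned keys are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def classify_health(status_code: int, body: str) -> Dict[str, bool]:
    """Map a /health response to liveness, readiness, and inference flags."""
    live = status_code == 200
    ai: Optional[Any] = None
    if live:
        try:
            # Assumption: aiAvailability sits at the top level of the JSON body.
            ai = json.loads(body).get("aiAvailability")
        except (ValueError, AttributeError):
            ai = None  # malformed or non-object body: treat as unknown
    return {
        "live": live,
        # Mirrors the readiness check in the table: aiAvailability != "None"
        "readiness_probe_passes": live and ai not in (None, "None"),
        # Inference is usable only once the engine reports ENABLED
        "inference_ready": live and ai == "ENABLED",
    }
```

With this split, a deployment can be declared live as soon as /health answers, while a smoke test that needs detections waits for `inference_ready`.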
Rollback Procedures
Trigger criteria:
- Health check fails after deployment (3 consecutive failures)
- Error rate spike detected in monitoring
- Critical bug reported by users
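The first criterion is the only one that is mechanically checkable. A minimal sketch of the "3 consecutive failures" rule follows, assuming health results arrive as a sequence of booleans (True for healthy); the function name is hypothetical.

```python
from typing import Iterable


def should_roll_back(results: Iterable[bool], threshold: int = 3) -> bool:
    """Return True once `threshold` health checks in a row have failed."""
    consecutive = 0
    for healthy in results:
        # A healthy result resets the streak; a failure extends it.
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= threshold:
            return True
    return False
```

Counting consecutive failures rather than total failures keeps a single transient error during startup from triggering a rollback.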
Rollback steps:
- Run deploy.sh --rollback (redeploys the previous image tag)
- Verify health via health-check.sh
- Notify stakeholders of rollback
- Investigate root cause
Previous image tag: Saved by stop-services.sh to .deploy-previous-tag before each deployment.
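The tag handoff can be pictured as a one-line file that is written before the stop and read back on rollback. The real logic lives in the shell scripts, whose contents are not shown here; the helpers below are hypothetical and assume the file holds a single tag as plain text.

```python
from pathlib import Path
from typing import Optional

# File name taken from the text above; resolved relative to the working directory.
TAG_FILE = Path(".deploy-previous-tag")


def save_previous_tag(tag: str, path: Path = TAG_FILE) -> None:
    """Record the currently running image tag before it is replaced."""
    path.write_text(tag.strip() + "\n", encoding="utf-8")


def load_previous_tag(path: Path = TAG_FILE) -> Optional[str]:
    """Return the saved tag, or None if no earlier deployment recorded one."""
    try:
        tag = path.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None
    return tag or None
```

Returning None for a missing or empty file lets a rollback fail loudly instead of redeploying an empty tag.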
Deployment Checklist
Pre-deployment:
- All CI pipeline stages pass (lint, test, security, e2e)
- Security scan clean (zero critical/high CVEs)
- Environment variables configured on target
- classes.json available on target (or mounted volume)
- ONNX model available in Loader service
- Loader and Annotations services reachable from target
Post-deployment:
- /health returns 200
- aiAvailability transitions to ENABLED after first request
- SSE stream endpoint accessible
- Detection request returns valid results
- Logs show no errors
Graceful Shutdown
Current limitation: No application-level graceful shutdown is implemented. Detections still in progress when the container stops are aborted. Background TensorRT conversion runs in a daemon thread, which terminates with the process.
Mitigation: The stop-services.sh script sends SIGTERM with a 30-second grace period, allowing uvicorn to finish in-flight HTTP requests. Long-running video detections may be interrupted.
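The SIGTERM-then-wait behavior described above can be sketched for a plain child process. This is not the content of stop-services.sh, which is not shown; `stop_with_grace` is a hypothetical helper that applies the same 30-second default to a `subprocess.Popen` handle.

```python
import subprocess


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> bool:
    """Ask a process to stop, wait up to grace_seconds, then force-kill it.

    Returns True if the process exited within the grace period.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        # Grace period elapsed: whatever is still running is cut off here,
        # which is how a long-running video detection would be interrupted.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()
        return False
```

A False return value marks a stop where in-flight work was cut off, which may be worth logging alongside the deployment.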
Database Migrations
Not applicable — the Detections service has no database. All persistence is delegated to the Annotations service.