mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 21:56:33 +00:00
be4cab4fcb
# Deployment Procedures

## Deployment Strategy

**Pattern**: Rolling deployment. The service runs as a single instance, so blue-green is not applicable without a load balancer.

**Rationale**: The Detections service runs as a single instance and holds no persistent state, only in-memory state (`_active_detections`, `_event_queues`). Rolling replacement with graceful shutdown is the simplest approach.

**Zero-downtime**: Achievable only with a load-balanced multi-instance setup. For single-instance deployments, expect brief downtime (~10-30s) while the container restarts.

## Deployment Flow

```
1. Pull new images                   → pull-images.sh
2. Stop current services gracefully  → stop-services.sh (30s grace)
3. Start new services                → start-services.sh
4. Verify health                     → health-check.sh
5. If unhealthy                      → rollback (deploy.sh --rollback)
```

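
The flow above can be sketched as a small driver. This is a minimal sketch and not the project's actual `deploy.sh`: the script names come from the flow, but the `deploy` function, the injectable `run` callable, the relative script paths, and the assumption that each script signals failure with a nonzero exit code are all illustrative.

```python
import subprocess
from typing import Callable, Sequence

# Script names are taken from the deployment flow; invoking them from the
# current directory is an assumption of this sketch.
PRE_HEALTH_STEPS = (
    ("./pull-images.sh",),
    ("./stop-services.sh",),
    ("./start-services.sh",),
)
HEALTH_CHECK = ("./health-check.sh",)
ROLLBACK = ("./deploy.sh", "--rollback")


def run_script(cmd: Sequence[str]) -> int:
    """Run one script and return its exit code (assumed: 0 means success)."""
    return subprocess.run(list(cmd), check=False).returncode


def deploy(run: Callable[[Sequence[str]], int] = run_script) -> str:
    """Run steps 1-4 in order and trigger step 5 if the health check fails."""
    for cmd in PRE_HEALTH_STEPS:
        if run(cmd) != 0:
            # Stop at the first failing step and report which one it was.
            return f"failed:{cmd[0]}"
    if run(HEALTH_CHECK) != 0:
        run(ROLLBACK)
        return "rolled_back"
    return "deployed"
```

Passing `run` as a parameter keeps the ordering logic testable without touching real containers: a fake runner can return a nonzero code for `health-check.sh` to confirm that the rollback command is issued last.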
## Health Checks

| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health` | 30s | 3 failures → restart |
| Readiness | HTTP GET | `/health` (check `aiAvailability` != "None") | 10s | Wait for engine init |

**Note**: The service's `/health` endpoint returns `{"status": "healthy"}` immediately. The reported `aiAvailability` value transitions through DOWNLOADING → CONVERTING → ENABLED, starting on the first inference request (lazy init). Readiness therefore depends on the use case: the API is ready to accept requests immediately, but inference is ready only after engine initialization completes.

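
The two probes in the table can be evaluated by a small client-side helper. The following is a minimal sketch, not the project's `health-check.sh`. It assumes `aiAvailability` is a top-level field of the `/health` JSON body, and it compares values case-insensitively because the exact casing is not specified here. The names `fetch_health`, `api_ready`, `inference_ready`, and `LivenessTracker` are hypothetical, and the tracker treats the "3 failures" threshold as consecutive failures.

```python
import json
import urllib.request
from dataclasses import dataclass


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def api_ready(payload: dict) -> bool:
    """The API accepts requests as soon as /health reports "healthy"."""
    return payload.get("status") == "healthy"


def inference_ready(payload: dict) -> bool:
    """Inference is ready only once the engine reports ENABLED.

    Assumption: aiAvailability is a top-level string field; casing is
    normalized because this document does not pin it down.
    """
    return str(payload.get("aiAvailability", "")).upper() == "ENABLED"


@dataclass
class LivenessTracker:
    """Counts consecutive failed liveness probes (threshold from the table)."""

    threshold: int = 3
    consecutive_failures: int = 0

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True when a restart is due."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

`inference_ready` is deliberately stricter than the table's readiness condition (anything other than "None"). A caller that only needs the API to accept uploads can gate on `api_ready`; one that needs results right away should send a first request and then poll until `inference_ready` returns true, since initialization is lazy.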
## Rollback Procedures

**Trigger criteria**:
- Health check fails after deployment (3 consecutive failures)
- Error rate spike detected in monitoring
- Critical bug reported by users

**Rollback steps**:
1. Run `deploy.sh --rollback` to redeploy the previous image tag
2. Verify health via `health-check.sh`
3. Notify stakeholders of the rollback
4. Investigate the root cause

**Previous image tag**: Saved by `stop-services.sh` to `.deploy-previous-tag` before each deployment.

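
The previous-tag bookkeeping described above can be illustrated as follows. This is a sketch under assumptions, not the contents of `stop-services.sh` or `deploy.sh`: only the file name `.deploy-previous-tag` comes from this document, while the single-line file format and the helper names `save_previous_tag` and `load_previous_tag` are hypothetical.

```python
from pathlib import Path

TAG_FILE = ".deploy-previous-tag"  # file name taken from this document


def save_previous_tag(tag: str, directory: Path) -> Path:
    """Record the currently running image tag before it is replaced.

    Assumption: the file holds the tag on a single line.
    """
    path = directory / TAG_FILE
    path.write_text(tag.strip() + "\n", encoding="utf-8")
    return path


def load_previous_tag(directory: Path) -> str:
    """Return the recorded tag, or fail loudly if rollback is impossible."""
    path = directory / TAG_FILE
    if not path.exists():
        raise FileNotFoundError(f"no previous tag recorded at {path}")
    tag = path.read_text(encoding="utf-8").strip()
    if not tag:
        raise ValueError(f"{path} is empty; cannot determine rollback target")
    return tag
```

Failing with an explicit error when the file is missing or empty matters on a first deployment, where no previous tag exists and a rollback has nothing to return to.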
## Deployment Checklist

Pre-deployment:
- [ ] All CI pipeline stages pass (lint, test, security, e2e)
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Environment variables configured on target
- [ ] `classes.json` available on target (or mounted volume)
- [ ] ONNX model available in Loader service
- [ ] Loader and Annotations services reachable from target

Post-deployment:
- [ ] `/health` returns 200
- [ ] `aiAvailability` transitions to ENABLED after first request
- [ ] SSE stream endpoint accessible
- [ ] Detection request returns valid results
- [ ] Logs show no errors

## Graceful Shutdown

**Current limitation**: No graceful shutdown is implemented. In-progress detections are aborted when the container stops. Background TensorRT conversion runs in a daemon thread, which terminates with the process.

**Mitigation**: The `stop-services.sh` script sends SIGTERM with a 30-second grace period, which allows uvicorn to finish in-flight HTTP requests. Long-running video detections may still be interrupted.

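
The SIGTERM-plus-grace-period pattern mentioned in the mitigation can be sketched at the process level. This is an illustration only: how `stop-services.sh` actually stops the container is not shown in this document, and `stop_with_grace` is a hypothetical helper built on the standard `subprocess` module.

```python
import subprocess


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> str:
    """Ask a process to exit, then force it if the grace period expires."""
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace_seconds)
        return "graceful"
    except subprocess.TimeoutExpired:
        # The process ignored or outlasted SIGTERM; work still in progress
        # at this point is lost.
        proc.kill()
        proc.wait()
        return "killed"
```

The `"killed"` branch corresponds to the case called out above: a video detection that runs longer than the grace period is cut off rather than completed.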
## Database Migrations

Not applicable: the Detections service has no database. All persistence is delegated to the Annotations service.