detections/_docs/04_deploy/deployment_procedures.md
Oleksandr Bezdieniezhnykh be4cab4fcb [AZ-178] Implement streaming video detection endpoint
- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
2026-04-01 03:11:43 +03:00


# Deployment Procedures
## Deployment Strategy
**Pattern**: Rolling deployment (single-instance service; blue-green not applicable without load balancer).
**Rationale**: The Detections service runs as a single instance, and its only state (`_active_detections`, `_event_queues`) lives in memory, so there is nothing to drain or migrate on shutdown. Rolling replacement with graceful shutdown is the simplest approach.
**Zero-downtime**: Achievable only with load-balanced multi-instance setup. For single-instance deployments, brief downtime (~10-30s) during container restart is expected.
## Deployment Flow
```
1. Pull new images → pull-images.sh
2. Stop current services gracefully → stop-services.sh (30s grace)
3. Start new services → start-services.sh
4. Verify health → health-check.sh
5. If unhealthy → rollback (deploy.sh --rollback)
```
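The five steps above can be sketched as a small orchestrator (the script names come from the flow; the `run_deploy` wrapper and its rollback-on-unhealthy behavior are an illustration, not the actual `deploy.sh` implementation):

```python
import subprocess


def run(cmd: str) -> bool:
    """Run one deployment step; success means exit code 0."""
    return subprocess.run(cmd, shell=True).returncode == 0


def run_deploy(pull: str, stop: str, start: str, health: str, rollback: str) -> bool:
    """Steps 1-4: pull, stop, start, verify; step 5: roll back if unhealthy."""
    for step in (pull, stop, start):
        if not run(step):
            return False          # abort early; nothing new is running yet
    if run(health):
        return True               # deployment verified healthy
    run(rollback)                 # step 5: unhealthy -> roll back
    return False
```

With the real scripts this would be called as `run_deploy("./pull-images.sh", "./stop-services.sh", "./start-services.sh", "./health-check.sh", "./deploy.sh --rollback")`.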
## Health Checks
| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health` | 30s | 3 failures → restart |
| Readiness | HTTP GET | `/health` (check `aiAvailability` != "None") | 10s | Wait for engine init |
**Note**: The `/health` endpoint returns `{"status": "healthy"}` immediately; `aiAvailability` transitions through DOWNLOADING → CONVERTING → ENABLED during lazy engine initialization, which is triggered by the first inference request. What counts as "ready" therefore depends on the use case: the API accepts requests as soon as the process is up, but inference only succeeds once engine initialization completes.
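A probe that distinguishes "API up" from "inference ready" can inspect the response body; a minimal sketch, assuming the JSON field names shown in the table (the helper functions themselves are hypothetical):

```python
import json


def is_api_live(payload: str) -> bool:
    """Liveness: the endpoint answered with a healthy status."""
    return json.loads(payload).get("status") == "healthy"


def is_inference_ready(payload: str) -> bool:
    """Readiness per the table above: aiAvailability has left "None".
    A stricter probe could require the final "ENABLED" state instead."""
    return json.loads(payload).get("aiAvailability") != "None"
```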
## Rollback Procedures
**Trigger criteria**:
- Health check fails after deployment (3 consecutive failures)
- Error rate spike detected in monitoring
- Critical bug reported by users
**Rollback steps**:
1. Run `deploy.sh --rollback` — redeploys the previous image tag
2. Verify health via `health-check.sh`
3. Notify stakeholders of rollback
4. Investigate root cause
**Previous image tag**: Saved by `stop-services.sh` to `.deploy-previous-tag` before each deployment.
## Deployment Checklist
Pre-deployment:
- [ ] All CI pipeline stages pass (lint, test, security, e2e)
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Environment variables configured on target
- [ ] `classes.json` available on target (or mounted volume)
- [ ] ONNX model available in Loader service
- [ ] Loader and Annotations services reachable from target
Post-deployment:
- [ ] `/health` returns 200
- [ ] `aiAvailability` transitions to ENABLED after first request
- [ ] SSE stream endpoint accessible
- [ ] Detection request returns valid results
- [ ] Logs show no errors
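The first two post-deployment checks can be automated. A sketch with an injected `fetch` callable so it stays transport-agnostic (the `smoke_check` helper and its failure messages are hypothetical, not part of the service):

```python
def smoke_check(fetch):
    """fetch(path) -> (status_code, body_dict). Returns a list of failure
    messages; an empty list means these checklist items pass."""
    failures = []
    status, body = fetch("/health")
    if status != 200:
        failures.append("/health did not return 200")
    # DOWNLOADING/CONVERTING are acceptable mid-init; "None" before the
    # first inference request means lazy init has not been triggered yet.
    if body.get("aiAvailability") not in ("ENABLED", "DOWNLOADING", "CONVERTING"):
        failures.append("aiAvailability not progressing toward ENABLED")
    return failures
```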
## Graceful Shutdown
**Current limitation**: No graceful shutdown is implemented. In-progress detections are aborted on container stop, and the background TensorRT conversion runs in a daemon thread that terminates with the process.
**Mitigation**: The `stop-services.sh` script sends SIGTERM with a 30-second grace period, allowing uvicorn to finish in-flight HTTP requests. Long-running video detections may be interrupted.
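The SIGTERM-plus-grace contract that `stop-services.sh` relies on is the standard terminate/wait/kill pattern; a sketch (not the script itself), with the default mirroring the documented 30-second grace period:

```python
import subprocess


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> None:
    """Send SIGTERM, wait up to grace_seconds for in-flight work to finish,
    then SIGKILL if the process has not exited."""
    proc.terminate()                    # SIGTERM: uvicorn drains HTTP requests
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                     # grace period exceeded: hard stop
        proc.wait()
```

With Docker the same contract is expressed as `docker stop --time 30 <container>`.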
## Database Migrations
Not applicable — the Detections service has no database. All persistence is delegated to the Annotations service.