# Deployment Procedures

## Deployment Strategy

**Pattern**: Rolling deployment (single-instance service; blue-green is not applicable without a load balancer).

**Rationale**: The Detections service runs as a single instance and persists nothing itself, but it holds in-memory state (`_active_detections`, `_event_queues`) that is lost on restart. Replacing the container in place, with a SIGTERM grace period on stop, is the simplest approach.

**Zero-downtime**: Achievable only with a load-balanced multi-instance setup. For single-instance deployments, expect brief downtime (~10-30s) while the container restarts.

## Deployment Flow

```
1. Pull new images                  → pull-images.sh
2. Stop current services gracefully → stop-services.sh (30s grace)
3. Start new services               → start-services.sh
4. Verify health                    → health-check.sh
5. If unhealthy                     → rollback (deploy.sh --rollback)
```

## Health Checks

| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health` | 30s | 3 failures → restart |
| Readiness | HTTP GET | `/health` (check `aiAvailability` != "None") | 10s | Wait for engine init |

**Note**: The service's `/health` endpoint returns `{"status": "healthy"}` immediately. Engine initialization is lazy: on the first inference request, `aiAvailability` transitions through DOWNLOADING → CONVERTING → ENABLED. Readiness therefore depends on the use case: the API accepts requests immediately, but inference is ready only after engine initialization completes.

## Rollback Procedures

**Trigger criteria**:

- Health check fails after deployment (3 consecutive failures)
- Error rate spike detected in monitoring
- Critical bug reported by users

**Rollback steps**:

1. Run `deploy.sh --rollback` to redeploy the previous image tag
2. Verify health via `health-check.sh`
3. Notify stakeholders of the rollback
4. Investigate the root cause

**Previous image tag**: Saved by `stop-services.sh` to `.deploy-previous-tag` before each deployment.

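The verification step and the three-consecutive-failure rollback trigger can be sketched roughly as follows. This is an illustration only, not the contents of `health-check.sh`: the URL and port, the function names, and the 10-second default interval are assumptions, and only the `status` and `aiAvailability` fields come from the description of `/health` above.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: host and port are placeholders, not taken from the service config.
HEALTH_URL = "http://localhost:8080/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> dict:
    """GET the health endpoint and return the parsed JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_live(body: dict) -> bool:
    # Liveness: the service answers and reports itself healthy.
    return body.get("status") == "healthy"


def is_inference_ready(body: dict) -> bool:
    # Inference readiness: the lazily initialized engine has reached ENABLED.
    return body.get("aiAvailability") == "ENABLED"


def verify_deployment(
    fetch: Callable[[], dict] = fetch_health,
    max_failures: int = 3,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True on the first live response, False after max_failures
    consecutive failed checks."""
    failures = 0
    while failures < max_failures:
        try:
            if is_live(fetch()):
                return True
        except (OSError, ValueError):
            # Connection errors, timeouts, and non-JSON bodies count as failures.
            pass
        failures += 1
        if failures < max_failures:
            sleep(interval_s)
    return False
```

A caller could run `deploy.sh --rollback` when `verify_deployment()` returns `False`. Because engine initialization is lazy, `is_inference_ready` is only meaningful after at least one inference request has been sent.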
## Deployment Checklist

Pre-deployment:

- [ ] All CI pipeline stages pass (lint, test, security, e2e)
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Environment variables configured on target
- [ ] `classes.json` available on target (or mounted volume)
- [ ] ONNX model available in Loader service
- [ ] Loader and Annotations services reachable from target

Post-deployment:

- [ ] `/health` returns 200
- [ ] `aiAvailability` transitions to ENABLED after the first inference request
- [ ] SSE stream endpoint accessible
- [ ] Detection request returns valid results
- [ ] Logs show no errors

## Graceful Shutdown

**Current limitation**: The service implements no graceful shutdown of its own. In-progress detections are aborted when the container stops. Background TensorRT conversion runs in a daemon thread, which terminates with the process.

**Mitigation**: The `stop-services.sh` script sends SIGTERM with a 30-second grace period, which allows uvicorn to finish in-flight HTTP requests. Long-running video detections may still be interrupted.

## Database Migrations

Not applicable: the Detections service has no database. All persistence is delegated to the Annotations service.
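Returning to the graceful shutdown limitation: one possible way to close that gap is to track in-flight detection tasks and drain them within the grace period. The sketch below is hypothetical and does not describe the service's current code; `DetectionTracker`, `start`, and `drain` are invented names, and it assumes detections run as `asyncio` tasks.

```python
import asyncio


class DetectionTracker:
    """Tracks in-flight detection tasks so shutdown can wait for them.

    Hypothetical helper, not part of the Detections service.
    """

    def __init__(self) -> None:
        self._tasks = set()    # currently running asyncio tasks
        self._closing = False  # set once shutdown has begun

    def start(self, coro) -> asyncio.Task:
        """Schedule a detection coroutine, refusing new work during shutdown."""
        if self._closing:
            coro.close()  # avoid a "never awaited" warning for the rejected coroutine
            raise RuntimeError("service is shutting down")
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def drain(self, timeout_s: float) -> int:
        """Wait up to timeout_s for running tasks, then cancel the rest.

        Returns the number of tasks that had to be cancelled.
        """
        self._closing = True
        if not self._tasks:
            return 0
        _done, pending = await asyncio.wait(self._tasks, timeout=timeout_s)
        for task in pending:
            task.cancel()
        if pending:
            # Let cancelled tasks unwind before the process exits.
            await asyncio.gather(*pending, return_exceptions=True)
        return len(pending)
```

If something like this were wired into the application's shutdown hook (for example an ASGI lifespan handler), the drain timeout would need to stay below the 30-second grace period used by `stop-services.sh`, so that cancellation finishes before the container is killed.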