detections/_docs/04_deploy/deployment_procedures.md
Oleksandr Bezdieniezhnykh be4cab4fcb [AZ-178] Implement streaming video detection endpoint
- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
2026-04-01 03:11:43 +03:00


# Deployment Procedures
## Deployment Strategy
**Pattern**: Rolling deployment (single-instance service; blue-green not applicable without load balancer).
**Rationale**: The Detections service runs as a single instance, and its only state (`_active_detections`, `_event_queues`) lives in memory, so there is nothing to drain or migrate on shutdown. Rolling replacement with graceful shutdown is the simplest approach.
**Zero-downtime**: Achievable only with load-balanced multi-instance setup. For single-instance deployments, brief downtime (~10-30s) during container restart is expected.
## Deployment Flow
```
1. Pull new images → pull-images.sh
2. Stop current services gracefully → stop-services.sh (30s grace)
3. Start new services → start-services.sh
4. Verify health → health-check.sh
5. If unhealthy → rollback (deploy.sh --rollback)
```
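The five steps above can be sketched as a small orchestrator (the script names come from the flow; the `run_deploy` wrapper and its rollback-on-unhealthy behavior are an illustration, not the actual `deploy.sh` implementation):

```python
import subprocess


def run(cmd: str) -> bool:
    """Run one deployment step; success means exit code 0."""
    return subprocess.run(cmd, shell=True).returncode == 0


def run_deploy(pull: str, stop: str, start: str, health: str, rollback: str) -> bool:
    """Steps 1-4: pull, stop, start, verify; step 5: roll back if unhealthy."""
    for step in (pull, stop, start):
        if not run(step):
            return False          # abort early; nothing new is running yet
    if run(health):
        return True               # deployment verified healthy
    run(rollback)                 # step 5: unhealthy -> roll back
    return False
```

With the real scripts this would be called as `run_deploy("./pull-images.sh", "./stop-services.sh", "./start-services.sh", "./health-check.sh", "./deploy.sh --rollback")`.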
## Health Checks
| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health` | 30s | 3 failures → restart |
| Readiness | HTTP GET | `/health` (check `aiAvailability` != "None") | 10s | Wait for engine init |
**Note**: The `/health` endpoint returns `{"status": "healthy"}` immediately; `aiAvailability` transitions through DOWNLOADING → CONVERTING → ENABLED during lazy engine initialization, which is triggered by the first inference request. What counts as "ready" therefore depends on the use case: the API accepts requests as soon as the process is up, but inference only succeeds once engine initialization completes.
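A probe that distinguishes "API up" from "inference ready" can inspect the response body; a minimal sketch, assuming the JSON field names shown in the table (the helper functions themselves are hypothetical):

```python
import json


def is_api_live(payload: str) -> bool:
    """Liveness: the endpoint answered with a healthy status."""
    return json.loads(payload).get("status") == "healthy"


def is_inference_ready(payload: str) -> bool:
    """Readiness per the table above: aiAvailability has left "None".
    A stricter probe could require the final "ENABLED" state instead."""
    return json.loads(payload).get("aiAvailability") != "None"
```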
## Rollback Procedures
**Trigger criteria**:
- Health check fails after deployment (3 consecutive failures)
- Error rate spike detected in monitoring
- Critical bug reported by users
**Rollback steps**:
1. Run `deploy.sh --rollback` — redeploys the previous image tag
2. Verify health via `health-check.sh`
3. Notify stakeholders of rollback
4. Investigate root cause
**Previous image tag**: Saved by `stop-services.sh` to `.deploy-previous-tag` before each deployment.
## Deployment Checklist
Pre-deployment:
- [ ] All CI pipeline stages pass (lint, test, security, e2e)
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Environment variables configured on target
- [ ] `classes.json` available on target (or mounted volume)
- [ ] ONNX model available in Loader service
- [ ] Loader and Annotations services reachable from target
Post-deployment:
- [ ] `/health` returns 200
- [ ] `aiAvailability` transitions to ENABLED after first request
- [ ] SSE stream endpoint accessible
- [ ] Detection request returns valid results
- [ ] Logs show no errors
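The first two post-deployment checks can be automated. A sketch with an injected `fetch` callable so it stays transport-agnostic (the `smoke_check` helper and its failure messages are hypothetical, not part of the service):

```python
def smoke_check(fetch):
    """fetch(path) -> (status_code, body_dict). Returns a list of failure
    messages; an empty list means these checklist items pass."""
    failures = []
    status, body = fetch("/health")
    if status != 200:
        failures.append("/health did not return 200")
    # DOWNLOADING/CONVERTING are acceptable mid-init; "None" before the
    # first inference request means lazy init has not been triggered yet.
    if body.get("aiAvailability") not in ("ENABLED", "DOWNLOADING", "CONVERTING"):
        failures.append("aiAvailability not progressing toward ENABLED")
    return failures
```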
## Graceful Shutdown
**Current limitation**: No graceful shutdown is implemented. In-progress detections are aborted on container stop, and the background TensorRT conversion runs in a daemon thread that terminates with the process.
**Mitigation**: The `stop-services.sh` script sends SIGTERM with a 30-second grace period, allowing uvicorn to finish in-flight HTTP requests. Long-running video detections may be interrupted.
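The SIGTERM-plus-grace contract that `stop-services.sh` relies on is the standard terminate/wait/kill pattern; a sketch (not the script itself), with the default mirroring the documented 30-second grace period:

```python
import subprocess


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> None:
    """Send SIGTERM, wait up to grace_seconds for in-flight work to finish,
    then SIGKILL if the process has not exited."""
    proc.terminate()                    # SIGTERM: uvicorn drains HTTP requests
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                     # grace period exceeded: hard stop
        proc.wait()
```

With Docker the same contract is expressed as `docker stop --time 30 <container>`.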
## Database Migrations
Not applicable — the Detections service has no database. All persistence is delegated to the Annotations service.