mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 22:46:31 +00:00
104 lines
3.3 KiB
Markdown
104 lines
3.3 KiB
Markdown
# Deployment Procedures Template
|
|
|
|
Save as `_docs/04_deploy/deployment_procedures.md`.
|
|
|
|
---
|
|
|
|
```markdown
|
|
# [System Name] — Deployment Procedures
|
|
|
|
## Deployment Strategy
|
|
|
|
**Pattern**: [blue-green / rolling / canary]
|
|
**Rationale**: [why this pattern fits the architecture]
|
|
**Zero-downtime**: required for production deployments
|
|
|
|
### Graceful Shutdown
|
|
|
|
- Grace period: 30 seconds for in-flight requests
|
|
- Sequence: stop accepting new requests → drain connections → shutdown
|
|
- Container orchestrator: `terminationGracePeriodSeconds: 40`
|
|
|
|
### Database Migration Ordering
|
|
|
|
- Migrations run **before** new code deploys
|
|
- All migrations must be backward-compatible (old code works with new schema)
|
|
- Irreversible migrations require explicit approval
|
|
|
|
## Health Checks
|
|
|
|
| Check | Type | Endpoint | Interval | Failure Threshold | Action |
|
|
|-------|------|----------|----------|-------------------|--------|
|
|
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures | Restart container |
|
|
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures | Remove from load balancer |
|
|
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts | Kill and recreate |
|
|
|
|
### Health Check Responses
|
|
|
|
- `/health/live`: returns 200 if process is running (no dependency checks)
|
|
- `/health/ready`: returns 200 if all dependencies (DB, cache, queues) are reachable
|
|
|
|
## Staging Deployment
|
|
|
|
1. CI/CD builds and pushes Docker images tagged with git SHA
|
|
2. Run database migrations against staging
|
|
3. Deploy new images to staging environment
|
|
4. Wait for health checks to pass (readiness probe)
|
|
5. Run smoke tests against staging
|
|
6. If smoke tests fail: automatic rollback to previous image
|
|
|
|
## Production Deployment
|
|
|
|
1. **Approval**: manual approval required via [mechanism]
|
|
2. **Pre-deploy checks**:
|
|
- [ ] Staging smoke tests passed
|
|
- [ ] Security scan clean
|
|
- [ ] Database migration reviewed
|
|
- [ ] Monitoring alerts configured
|
|
- [ ] Rollback plan confirmed
|
|
3. **Deploy**: apply deployment strategy (blue-green / rolling / canary)
|
|
4. **Verify**: health checks pass, error rate stable, latency within baseline
|
|
5. **Monitor**: observe dashboards for 15 minutes post-deploy
|
|
6. **Finalize**: mark deployment as successful or trigger rollback
|
|
|
|
## Rollback Procedures
|
|
|
|
### Trigger Criteria
|
|
|
|
- Health check failures persist after deploy
|
|
- Error rate exceeds 5% for more than 5 minutes
|
|
- Critical alert fires within 15 minutes of deploy
|
|
- Manual decision by on-call engineer
|
|
|
|
### Rollback Steps
|
|
|
|
1. Redeploy previous Docker image tag (from CI/CD artifact)
|
|
2. Verify health checks pass
|
|
3. If database migration was applied:
|
|
- Run DOWN migration if reversible
|
|
- If irreversible: assess data impact, escalate if needed
|
|
4. Notify stakeholders
|
|
5. Schedule post-mortem within 24 hours
|
|
|
|
### Post-Mortem
|
|
|
|
Required after every production rollback:
|
|
- Timeline of events
|
|
- Root cause
|
|
- What went wrong
|
|
- Prevention measures
|
|
|
|
## Deployment Checklist
|
|
|
|
- [ ] All tests pass in CI
|
|
- [ ] Security scan clean (zero critical/high CVEs)
|
|
- [ ] Docker images built and pushed
|
|
- [ ] Database migrations reviewed and tested
|
|
- [ ] Environment variables configured for target environment
|
|
- [ ] Health check endpoints verified
|
|
- [ ] Monitoring alerts configured
|
|
- [ ] Rollback plan documented and tested
|
|
- [ ] Stakeholders notified of deployment window
|
|
- [ ] On-call engineer available during deployment
|
|
```
|